Unicode issues with SQL Server & JSON

AshleyBarnes · November 27, 2023, 7:14pm

We will be using the rest API to download a couple of JSON files, and then processing the JSON into some tables via Azure Data Factory. While that is being built, I want to process the JSON from the API manually via SQL Servers OPENROWSET and OPENJSON functions.

When I call OPENROWSET on SQL Server like this:

;WITH cte_results
AS
(
SELECT
b.[key] as row_Key,
c.[key] as Col_key,
c.value as Col_data
FROM
OPENROWSET (BULK ‘path\search.json’, SINGLE_NCLOB) as j
CROSS APPLY OPENJSON(BulkColumn) as b
CROSS APPLY OPENJSON(b.Value) as c
)
SELECT * FROM cte_results

I get an error saying Msg 4809, Level 16, State 1, Line 1
SINGLE_NCLOB requires a UNICODE (widechar) input file. The file specified is not Unicode.

This is a mystery to me, because according to Notepad++ the file is UTF-8, and UTF-8 is Unicode. I’ve played around with alternate settings in place of the SINGLE_NCLOB constant, but nothing works so far. I’ve tried reducing the data down to just one record in case a single record somewhere in the file has some obscure character. I’ve also tried saving the file by copy pasting the JSON, saving from Edge directly, and using Notepad++ to change the encoding back and forth.

When I look at the data, it appears to be quite normal, no bad or odd characters that are visible.

I’ve also pasted the data into, and over top of, a JSON file that was known to be working. I’ve also tried to make sure there is no BOM character at the start of the file

Note: I have redacted the actual path. It’s a UNC path, for which I can open other non-JIRA JSON files currently.