While using BULK INSERT statement in SQL Server 2008 to import tab delimited file in a table you might come across the following error:
Bulk load: DataFileType was incorrectly specified as widechar. DataFileType will be assumed to be char because the data file does not have a Unicode signature.
BULK INSERT statement that I used was:
BULK INSERT tblCountries_temp FROM 'c:\utf8_tab_delimited.txt' WITH( DATAFILETYPE = 'widechar', FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n' ) GO
This might be because the file has a UTF-8 type and each record is terminated with new line character (‘\n’). SQL Server 2008 does not support UTF-8, expecting such data in UTF-16 encoding with each record delimited with a carriage return, newline combination (‘\r\n’). Additionally, it needs a Byte Order Marker (BOM) at the beginning of the file.
Follow these steps to easily convert UTF-8 type to UTF-16.
- Open the tab delimited file in SQL Server 2008
- Select File -> Save <File Name> As…
- Select Save with Encoding… option from the save menu as shown below
- In the following window select Encoding as “Unicode (Big-Endian) – Codepage 1201” and Line endings as “Windows (CLRF)”
- You are done!
- Run the BULK INSERT script this time and you will see the results you have been craving for
- Good Luck!