SSIS – issue of importing Excel data to database

I really want to finish the jQuery series before write anything else, but just cannot help composing this post because it took me hours to solve the issue of importing an Excel sheet to the database.

I have a spread sheet which contains two columns as “Notes” and “Reference”, normally those two columns are filled by explanation words, less than 500 words but some of them exceeded 255 words.

When I first imported from SQL Server Management Studio, it gave the following error “[Excel Source [20]] Error: There was an error with Excel Source.Outputs[Excel Source Output].Columns[Notes] on Excel Source.Outputs[Excel Source Output]. The column status returned was: “Text was truncated or one or more characters had no match in the target code page.”.” which is make sense because some cell’s “notes” data is definitely longer than 255 characters, it seems the solution is simple, just change the field size to a larger one. However even I find every place as I can to change the size to 500, it still gave the upper error. Then I thought maybe it is due to the Management Studio’s restriction, so I create a SSIS package from scratch by VS2010. This SSIS package is very simple as below:

SSISExcelimport_1         SSISExcelimport_2

Actually I tried to add a data conversion component between the source and destination, but it is not the reason of why the truncate error still showing up, so I just ignore that part and keep the description simple.

I did find the place to reset the source input columns definition. Right click the source rectangle and select “show advanced editor” and you will see the screen below by click “Input and Output properties” tab.

SSISExcelimport_3 SSISExcelimport_4

The weird thing is you only be able to change the data type and size under “output columns” but not “External Columns”, actually “External Columns” allow you change everything, but just won’t save it and no alerts (crap…) which confused me for quite a while.

After some research, everything is clear. See the following from MSDN regarding SSIS Excel data driver:

Truncated text. When the driver determines that an Excel column contains text data, the driver selects the data type (string or memo) based on the longest value that it samples. If the driver does not discover any values longer than 255 characters in the rows that it samples, it treats the column as a 255-character string column instead of a memo column. Therefore, values longer than 255 characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key. For more information, see PRB: Transfer of Data from Jet 4.0 OLEDB Source Fails w/ Error.

That means, SSIS Excel import driver will use the first 10 (default) rows in the sheet for testing, if no data in this column are greater than 255 characters, then it treated this column as a “unicode string [DT_WSTR]” and size would be a firm 255 and cannot be modified under “External Columns”(SSIS looks like too smart…..).

So the solution could be(First of all, ensure your table fields are big enough, for example set to varchar(500) etc.):

1. put a long dummy string into the first cell of that column so that the driver will detected it and treated this column to be a “unicode text stream [DT_NText]”, then you can change the relative column under “Output Columns” to “unicode string [DT_WSTR]” and size could set to 500. Then the importing would pass and no truncate error showing up.

SSISExcelimport_5

2. change HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key to a big number, in this case, the SSIS will detect more rows in Excel. However normally the server registry key won’t be allowed to to touch by developers except you are the super admin…..

3. save this Excel to a csv file, then you will find “External Columns” allow you to modify and save, in this way you can easily to define the data type and size to what you want. However, if there are many double quotes, spaces, single quote, comma etc. inside your “Notes” field, you have to deal with another issue “define the right deliminator to  figure out columns”.

I would recommend the first solution, but it may not fit for some special scenarios. Also please read the full reference from :MSDN: SSIS – Excel Source

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s