Import CSV Problems
-
So, I'm trying to import Semicolon separated files into SQL Server. The files are in codepage 1252 and are "Text Qualified" with a broken bar ¦ In the preview everything looks fine, but in the imported Tables all text fields looks like this: ¦text¦ This wouldn't be a big problem if it wasn't for the fact that some text fields contain semicolons. Then some columns reads:
¦beginning of text
, and the next one:end of text¦
, and the last one might look like this:¦¦;¦¦
If I'm saving the import as a dtsx package and open it in visual studio, the problem remains the same. Is this a known bug or am I doing something wrong? Any solutions that doesn't require me to create my own CSV-import?Wrong is evil and must be defeated. - Jeff Ello
-
So, I'm trying to import Semicolon separated files into SQL Server. The files are in codepage 1252 and are "Text Qualified" with a broken bar ¦ In the preview everything looks fine, but in the imported Tables all text fields looks like this: ¦text¦ This wouldn't be a big problem if it wasn't for the fact that some text fields contain semicolons. Then some columns reads:
¦beginning of text
, and the next one:end of text¦
, and the last one might look like this:¦¦;¦¦
If I'm saving the import as a dtsx package and open it in visual studio, the problem remains the same. Is this a known bug or am I doing something wrong? Any solutions that doesn't require me to create my own CSV-import?Wrong is evil and must be defeated. - Jeff Ello
Why are you using a text qualifier because it is doing exactly what you asked it to.
Never underestimate the power of human stupidity - RAH I'm old. I know stuff - JSOP
-
Why are you using a text qualifier because it is doing exactly what you asked it to.
Never underestimate the power of human stupidity - RAH I'm old. I know stuff - JSOP
I thought the purpose of the text qualifier is to tell the parser that the following chunk of text is in one field despite containing a column delimiter Anyway, If I don't use a text qualifier, it behaves exactly the same, except that the preview now shows the same as the data in the tables. The data our supplier sends looks like this:
¦Column1¦;¦Column2¦;¦Column3¦;¦Column4¦
¦Text¦;¦Some other text¦;¦Text¦;¦Also text¦
¦¦;¦More text¦;¦Text with a ; embedded¦;¦¦
¦etc.¦;¦etc.¦;¦etc.¦;¦...¦What I want in the database is this:
Column1 Column2 Column3 Column4
Text Some other text Text Also text
More text Text with a ; embedded
etc. etc. etc. ...What I get is this:
Column1 ;Column2 Column3 Column4
¦Text¦ ¦Some other text¦ ¦Text¦ ¦Also text¦
¦¦ ¦More text¦ ¦Text with a embedded¦;¦¦
¦etc.¦ ¦etc.¦ ¦etc.¦ ¦...¦Wrong is evil and must be defeated. - Jeff Ello
-
So, I'm trying to import Semicolon separated files into SQL Server. The files are in codepage 1252 and are "Text Qualified" with a broken bar ¦ In the preview everything looks fine, but in the imported Tables all text fields looks like this: ¦text¦ This wouldn't be a big problem if it wasn't for the fact that some text fields contain semicolons. Then some columns reads:
¦beginning of text
, and the next one:end of text¦
, and the last one might look like this:¦¦;¦¦
If I'm saving the import as a dtsx package and open it in visual studio, the problem remains the same. Is this a known bug or am I doing something wrong? Any solutions that doesn't require me to create my own CSV-import?Wrong is evil and must be defeated. - Jeff Ello
-
Sorry to ask what might be a stupid question, You say the file is "Text Qualified" but have you actually set the "Text Qualifier" field in the import config?
There are no stupid questions. (Well ok, but this wasn't one ;P ) I have set the "Text qualifier" field in the General page to: ¦ In the advanced page I have set every field to TextQualified = True What gets at me is that this seem to work as expected in the preview tab.
Wrong is evil and must be defeated. - Jeff Ello
-
There are no stupid questions. (Well ok, but this wasn't one ;P ) I have set the "Text qualifier" field in the General page to: ¦ In the advanced page I have set every field to TextQualified = True What gets at me is that this seem to work as expected in the preview tab.
Wrong is evil and must be defeated. - Jeff Ello
Quote:
What gets at me is that this seem to work as expected in the preview tab.
That would have me tearing my hair out! At which stage I would probably try a reinstall :sigh: However, weird as this may sound (is anything truly weird when it comes to Microsoft??) have you tried not setting TextQualified on the individual fields? (i.e. just on the General page). I'm probably just grasping at straws tbh
-
Quote:
What gets at me is that this seem to work as expected in the preview tab.
That would have me tearing my hair out! At which stage I would probably try a reinstall :sigh: However, weird as this may sound (is anything truly weird when it comes to Microsoft??) have you tried not setting TextQualified on the individual fields? (i.e. just on the General page). I'm probably just grasping at straws tbh
Yes I have. Both suggestions. It also behaves the same if I create an SSIS project in Visual Studio.
Wrong is evil and must be defeated. - Jeff Ello
-
So, I'm trying to import Semicolon separated files into SQL Server. The files are in codepage 1252 and are "Text Qualified" with a broken bar ¦ In the preview everything looks fine, but in the imported Tables all text fields looks like this: ¦text¦ This wouldn't be a big problem if it wasn't for the fact that some text fields contain semicolons. Then some columns reads:
¦beginning of text
, and the next one:end of text¦
, and the last one might look like this:¦¦;¦¦
If I'm saving the import as a dtsx package and open it in visual studio, the problem remains the same. Is this a known bug or am I doing something wrong? Any solutions that doesn't require me to create my own CSV-import?Wrong is evil and must be defeated. - Jeff Ello
Have you tried using the "Import Flat File" wizard, instead of the "Import/Export Data" wizard? It's supposed to be better at this sort of thing... Import Flat File to SQL - SQL Server | Microsoft Docs[^] Otherwise, you might need to resort to Powershell or C#.
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
Quote:
What gets at me is that this seem to work as expected in the preview tab.
That would have me tearing my hair out! At which stage I would probably try a reinstall :sigh: However, weird as this may sound (is anything truly weird when it comes to Microsoft??) have you tried not setting TextQualified on the individual fields? (i.e. just on the General page). I'm probably just grasping at straws tbh
I just found out that opening the files in Notepad++ as codepage 1252 and saving them as UTF8, and then change the import from 1252 to UTF8, fixes the problem. I'm putting this down as a bug from MS. (That probably won't ever be fixed) The question is, what is the best way to fix it? For a one off I will just open and save all the files and import them as UTF8. But this will be a weekly import later on, preferably run as an SSIS package.
Wrong is evil and must be defeated. - Jeff Ello
-
Have you tried using the "Import Flat File" wizard, instead of the "Import/Export Data" wizard? It's supposed to be better at this sort of thing... Import Flat File to SQL - SQL Server | Microsoft Docs[^] Otherwise, you might need to resort to Powershell or C#.
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
Yes, the "Import Flat File" wizard doesn't recognize the Text qualifier at all. Meanwhile I have noticed that if I open a file in Notepad++ as codepage 1252 and saving it as UTF8, and then change the import from 1252 to UTF8, it works as advertised. This is OK as a one off, but what is the best way to change the file encoding that can be run from inside an SSIS package?
Wrong is evil and must be defeated. - Jeff Ello
-
Yes, the "Import Flat File" wizard doesn't recognize the Text qualifier at all. Meanwhile I have noticed that if I open a file in Notepad++ as codepage 1252 and saving it as UTF8, and then change the import from 1252 to UTF8, it works as advertised. This is OK as a one off, but what is the best way to change the file encoding that can be run from inside an SSIS package?
Wrong is evil and must be defeated. - Jeff Ello
Jörgen Andersson wrote:
what is the best way to change the file encoding that can be run from inside an SSIS package?
You can add an "execute process" task to execute an external program to change the file encoding. For example, you could use C# to change the encoding: Converting a text file from one encoding to another - Buck Hodges[^]
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
Jörgen Andersson wrote:
what is the best way to change the file encoding that can be run from inside an SSIS package?
You can add an "execute process" task to execute an external program to change the file encoding. For example, you could use C# to change the encoding: Converting a text file from one encoding to another - Buck Hodges[^]
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
Yes, this seems like the best option.:thumbsup:
Wrong is evil and must be defeated. - Jeff Ello
-
So, I'm trying to import Semicolon separated files into SQL Server. The files are in codepage 1252 and are "Text Qualified" with a broken bar ¦ In the preview everything looks fine, but in the imported Tables all text fields looks like this: ¦text¦ This wouldn't be a big problem if it wasn't for the fact that some text fields contain semicolons. Then some columns reads:
¦beginning of text
, and the next one:end of text¦
, and the last one might look like this:¦¦;¦¦
If I'm saving the import as a dtsx package and open it in visual studio, the problem remains the same. Is this a known bug or am I doing something wrong? Any solutions that doesn't require me to create my own CSV-import?Wrong is evil and must be defeated. - Jeff Ello
This might or might not help ... check what happens using Excel or Access and their X-referencing/inter-app-transferring abilities on specific data table (as .txt; I'm assuming you can morph tables by exporting them with whatever tool you're currently sleuthing the inputs and outputs). In Excel or Access, generally speaking, parsing data using what Microsoft terms a "delimiter" and those various options of which you speak, can often run afoul of it's own preaching (doc code won't run as-printed) ... in which case deploying UDF using the copious functionality of VBA most times does the trick. A favorite workaround of mine is to use Split and InStr, along with Find and Search although logic there becomes draconian quickly. It sounds like you're in the testing stages anyway. [Pleaso oh please ... have Excel andor Access] Final addition: Search Stack Overflow.
-
This might or might not help ... check what happens using Excel or Access and their X-referencing/inter-app-transferring abilities on specific data table (as .txt; I'm assuming you can morph tables by exporting them with whatever tool you're currently sleuthing the inputs and outputs). In Excel or Access, generally speaking, parsing data using what Microsoft terms a "delimiter" and those various options of which you speak, can often run afoul of it's own preaching (doc code won't run as-printed) ... in which case deploying UDF using the copious functionality of VBA most times does the trick. A favorite workaround of mine is to use Split and InStr, along with Find and Search although logic there becomes draconian quickly. It sounds like you're in the testing stages anyway. [Pleaso oh please ... have Excel andor Access] Final addition: Search Stack Overflow.
Excel doesn't handle text qualifiers as far as I can see. And handling stringsplits and such in Excel would indeed quickly become draconian. Access I don't even want to try. ;P
Wrong is evil and must be defeated. - Jeff Ello