Parsing a CSV file with commas and quotes as delimiters
-
So I'm reading a CSV file and splitting each line with "," as the delimiter, but some fields are wrapped in quotes so that they are not split, because the field itself contains a comma:
1530,Pasadena CA,"2008, 05/01","2005, 12/14"
Splitting on the comma alone gives:
1530 | Pasadena CA | "2008 | 05/01" | "2005 | 12/14"
I need the split to take the quotes into consideration, so that the result is:
1530 | Pasadena CA | "2008 05/01" | "2005 12/14"
-
You may wish to first search and replace all commas that fall inside a pair of quotation marks (with a ~ perhaps), and then split the resulting string. After that, you could do a search and replace on each of the split elements to turn the tildes back into commas.
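The placeholder idea above could be sketched roughly as follows. This is only an illustration: the class name `PlaceholderSplitter` is made up, and it assumes that `~` never occurs in the real data and that quotes are always balanced.

```csharp
using System;
using System.Text;

public static class PlaceholderSplitter
{
    // Assumption: this character never appears in the real data.
    private const char Placeholder = '~';

    public static string[] Split(string line)
    {
        // Pass 1: copy the line, masking commas that sit inside quotes.
        StringBuilder masked = new StringBuilder(line.Length);
        bool inQuotes = false;
        foreach (char c in line)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;   // entering or leaving a quoted field
                masked.Append(c);
            }
            else if (c == ',' && inQuotes)
            {
                masked.Append(Placeholder);
            }
            else
            {
                masked.Append(c);
            }
        }

        // Pass 2: a plain split is now safe.
        string[] fields = masked.ToString().Split(',');

        // Pass 3: restore the masked commas in every field.
        for (int i = 0; i < fields.Length; i++)
        {
            fields[i] = fields[i].Replace(Placeholder, ',');
        }
        return fields;
    }
}
```

Calling `PlaceholderSplitter.Split` on the sample line from the question yields four fields, with the surrounding quotes still attached to the two date fields.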
I wasn't, now I am, then I won't be anymore.
-
Please do not use this solution - the FileHelpers one is much better and easier to use / understand. You can't do that easily with string.Split - it splits on every occurrence of the delimiter and has no notion of quoted fields. You will either have to do it manually or use a regex:
string inputText = @"1530,Pasadena CA,""2008, 05/01"",""2005, 12/14""";
Regex regex = new Regex("(?=,)|[^\",]+|\"(?:[^\"]|\"\")*\"", RegexOptions.Compiled);
MatchCollection ms = regex.Matches(inputText);
foreach (Match m in ms)
{
    if (m.Length > 0)
    {
        Console.WriteLine(m.Value);
    }
}
The regex looks ugly, but it was thrown together a bit quickly - that's why it generates blank matches.
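For the "do it manually" route, a minimal sketch might look like the following. The class name `QuoteAwareSplitter` is hypothetical, and the sketch assumes one record per line, strips the surrounding quotes, and treats a doubled quote inside a quoted field as a literal quote.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuoteAwareSplitter
{
    public static List<string> Split(string line)
    {
        List<string> fields = new List<string>();
        StringBuilder current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"')
                {
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        current.Append('"');   // doubled quote: keep one literal quote
                        i++;                   // and skip the second one
                    }
                    else
                    {
                        inQuotes = false;      // closing quote
                    }
                }
                else
                {
                    current.Append(c);         // commas in here are just data
                }
            }
            else if (c == '"')
            {
                inQuotes = true;               // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field boundary
                current.Length = 0;
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());        // last field has no trailing comma
        return fields;
    }
}
```

Unlike the regex, this never produces blank matches that need filtering, but an empty field between two commas does come back as an empty string.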
Real men don't use instructions. They are only the manufacturers opinion on how to put the thing together.
-
You may want to read the Rive article that PIEBALD created. :)
Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, and improve readability.
-
10! :-D
-
PIEBALDconsult wrote:
10!
That is 5 each. :-D
-
Observe the coding principle of DRY (don't repeat yourself); in fact, don't repeat what others have already done either. If at all possible, try using the FileHelpers library. It does nearly everything you could possibly need for CSV processing. I used it recently and found it great.
-
It is also available as an article: FileHelpers v2.0 - Delimited (CSV) or Fixed Data Import/Export Framework. Thanks for this, Brady!
-
Please ignore my earlier solution - I have been playing with the one Brady supplied, and I am impressed by how easily and how well it works. Add a reference to FileHelpers, then add a class:
using System;
using FileHelpers;

namespace GUITester
{
    /// <summary>
    /// Dummy class to check FileHelper functionality.
    /// </summary>
    [DelimitedRecord(",")]
    public class Customer
    {
        #region Constants
        #endregion

        #region Fields
        #region Internal
        #endregion
        #region Property bases
        #endregion
        #region FileHelper interaction
        public int Id;
        public string Location;
        [FieldQuoted(), FieldConverter(ConverterKind.Date, "yyyy, MM/dd")]
        public DateTime AccessDate;
        [FieldQuoted(), FieldConverter(ConverterKind.Date, "yyyy, MM/dd")]
        public DateTime CreateDate;
        #endregion
        #endregion

        #region Properties
        #endregion

        #region Regular Expressions
        #endregion

        #region Enums
        #endregion

        #region Constructors
        #endregion

        #region Events
        #region Event Constructors
        #endregion
        #region Event Handlers
        #endregion
        #endregion

        #region Public Methods
        #endregion

        #region Overrides
        #endregion

        #region Private Methods
        #endregion
    }
}
Add code to your form:
using FileHelpers;
...
FileHelperEngine engine = new FileHelperEngine(typeof(Customer));
Customer[] customers = engine.ReadFile(@"F:\Temp\Records.txt") as Customer[];
foreach (Customer c in customers)
{
    Console.WriteLine(c.Id + ": ");
    Console.WriteLine(" >" + c.Location);
    Console.WriteLine(" >" + c.AccessDate);
    Console.WriteLine(" >" + c.CreateDate);
}
And by George it works! I am very impressed indeed. If you use it, vote the article a five, it deserves it...
-
I take it you're a regions fan. :laugh:
-
Yes, to an extent. It's my standard boilerplate class file template - it just means I can collapse everything I'm not interested in at the moment without having too many tabs open at the top of the IDE. Can't stick it when you get regions inside a method though... :laugh:
-
Personally, I prefer a separation of responsibilities -- one method to read the data, another to split it, another to parse it, etc. Sometimes I allow CSV or other text file lines to be commented out (e.g. first character a semi-colon) -- I don't want to split or parse them. I haven't really looked at FileHelpers, but does it support newlines within values?
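As a rough illustration of that separation, here is one way the read and split steps might be kept apart. The names `CsvPipeline`, `ReadDataLines` and `SplitLines` are invented for this sketch, and it assumes a semi-colon in the first column marks a commented-out line. Because it reads line by line, it does not cope with newlines inside quoted values.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvPipeline
{
    // Step 1: read raw lines, dropping blank and commented-out ones.
    public static IEnumerable<string> ReadDataLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0 || line[0] == ';')
            {
                continue;   // never handed to the splitter or parser
            }
            yield return line;
        }
    }

    // Step 2: split each surviving line with whatever splitter the caller supplies.
    public static IEnumerable<string[]> SplitLines(
        IEnumerable<string> lines, Func<string, string[]> splitter)
    {
        foreach (string line in lines)
        {
            yield return splitter(line);
        }
    }
}
```

Parsing the split fields into typed values would then be a third method of its own, and any quote-aware splitter can be passed in as the `splitter` argument.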
-
I haven't worked with FileHelpers for a few months, and have lost touch a bit, but AFAIR it caters for both your requirements. As for SoR (separation of responsibilities), I also think it allows you to handle your own reading and pass it data only for splitting.