RegEx Split

    Member_14877474 wrote (#1):

    Hi, I have a line in my CSV file as below:

    ""|*"I have delimiter |* and an escaped \" quote in me"|*100|*200|*300|*"am a string"|*""

    I have to interpret the " quote as the text qualifier and |* as the delimiter. An escaped quote \" must not count as a qualifier; it is part of the string. 100, 200 and 300 are integer data fields, so they are not surrounded by the text qualifier.

    The expected result is an array of strings:

    a[0] = "" which is a null (empty) string
    a[1] = "I have delimiter |* and an escaped \" quote in me"
    a[2] = "100"
    a[3] = "200"
    a[4] = "300"
    a[5] = "am a string"
    a[6] = "" which is a null (empty) string

    The code is below. It looks like \" is not being treated as an escaped quote. Could you please let me know how to fix this? Thanks.

    The regular expression code comes from: Split Function that Supports Text Qualifiers

    using System.Text.RegularExpressions;

    public string[] Split(string expression, string delimiter,
        string qualifier, bool ignoreCase)
    {
        // Match a delimiter only when it is followed by an even number of
        // qualifier characters, i.e. when it sits outside a qualified string.
        string _Statement = String.Format
            ("{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))",
            Regex.Escape(delimiter), Regex.Escape(qualifier));

        RegexOptions _Options = RegexOptions.Compiled | RegexOptions.Multiline;
        if (ignoreCase) _Options = _Options | RegexOptions.IgnoreCase;

        Regex _Expression = new Regex(_Statement, _Options);
        return _Expression.Split(expression);
    }

      Richard Deeming wrote (#2):

      Your function works fine for me if you pass in the correct values.

      const string input = "\"\"|*\"I have delimiter |* and an escaped \\\" quote in me\"|*100|*200|*300|*\"am a string\"|*\"\"";
      string[] result = Split(input, "|*", @"\""", false);

      Split | C# Online Compiler | .NET Fiddle
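Note that `@"\"""` in the call above is the two-character string `\"`, so the escaped quote itself is being passed as the qualifier. As an alternative, here is a minimal sketch that scans the line field by field instead of splitting on the delimiter, so that a backslash-escaped qualifier stays inside its field. It is not the code from the linked article: the names `QualifiedSplitter` and `SplitQualified` are hypothetical, and the backslash as escape character is an assumption taken from the sample line. Like the original `Split`, it leaves the qualifiers on the returned fields.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QualifiedSplitter
{
    // Hypothetical helper: reads one field at a time instead of splitting.
    // Assumes a backslash escapes the character that follows it inside a qualified field.
    public static string[] SplitQualified(string expression, string delimiter, string qualifier)
    {
        string q = Regex.Escape(qualifier);
        string d = Regex.Escape(delimiter);

        // One field, anchored at the current position (\G):
        //   either  qualifier, then (escaped char | any char that does not start a qualifier)*, then qualifier
        //   or      any run of characters that does not start a delimiter (possibly empty)
        var field = new Regex(
            $@"\G(?:{q}(?:\\.|(?!{q}).)*{q}|(?:(?!{d}).)*)",
            RegexOptions.Singleline);

        var result = new List<string>();
        int pos = 0;
        while (true)
        {
            Match m = field.Match(expression, pos);
            result.Add(m.Value);               // qualifiers are kept, as in the original Split
            pos = m.Index + m.Length;
            if (pos >= expression.Length) break;

            // Every field except the last must be followed by the delimiter.
            if (string.CompareOrdinal(expression, pos, delimiter, 0, delimiter.Length) != 0)
                throw new FormatException("Expected delimiter at position " + pos);
            pos += delimiter.Length;
        }
        return result.ToArray();
    }

    public static void Main()
    {
        const string input = "\"\"|*\"I have delimiter |* and an escaped \\\" quote in me\"|*100|*200|*300|*\"am a string\"|*\"\"";
        string[] fields = SplitQualified(input, "|*", "\"");
        for (int i = 0; i < fields.Length; i++)
            Console.WriteLine("a[" + i + "] = " + fields[i]);
    }
}
```

Here the qualifier is passed as a plain `"`, and the escape handling lives in the pattern rather than in the arguments. A field that opens with a qualifier but never closes it falls through to the second alternative and is read up to the next delimiter; whether that is acceptable depends on how strict the parser needs to be.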


      "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
