Trying to parse legacy data with RegEx [modified]

  Lost User wrote:

    smesser wrote:

    All can be variable length and there isn't a delimiter. Unfortunately, you also can't depend on the tags having no spaces before the colons (i.e. VIN : vs. YRMD:).

    This is true, which makes parsing strings like that pretty hard. I did, however, find a pattern in this string, though it won't necessarily hold for other strings. It's always a pair of arbitrary strings separated by a colon, like a key/value pair. So a space to the left of the colon (or perhaps also to the right?) doesn't count as a delimiter between pairs. Based on that, the following regex will work:

        [^\ ]+\ *:[^\ ]+

    If a space to the right of the colon shouldn't count as a delimiter either, then this will work:

        [^\ ]+\ *:\ *[^\ ]+

    regards

    Steve Messer wrote (post #7):

    Then what, use named groups to get your values? EDIT:

    private void Test9()
    {
        string original = "LIC#:ABC123 YRMD:03 MAKE:CHEV BTM :CP VIN :1G1JC12F137230800";
        Regex r = new Regex(@"[^\ ]+\ *:\ *[^\ ]+");
        MatchCollection theMatches = r.Matches(original);
        foreach (Match theMatch in theMatches)
        {
            Console.WriteLine(theMatch.Value);
        }
    }

    modified on Friday, March 21, 2008 12:42 PM
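
    On the named-groups question: a minimal sketch, not from the thread, of how the key/value regex above could be given named groups so that each tag and value comes out separately. The group names key and value, the class name KeyValueDemo, and the Parse helper are made up for illustration. The tag character class also excludes the colon, which is a small variation on the posted pattern, and the sketch assumes every value is non-empty and contains no spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class KeyValueDemo
{
    // key:   one or more characters that are neither a space nor a colon
    // value: one or more non-space characters after the colon
    // Optional spaces are allowed on either side of the colon.
    static readonly Regex Pair = new Regex(@"(?<key>[^ :]+) *: *(?<value>[^ ]+)");

    // Hypothetical helper: collects every tag/value pair found in the line.
    static Dictionary<string, string> Parse(string line)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in Pair.Matches(line))
        {
            fields[m.Groups["key"].Value] = m.Groups["value"].Value;
        }
        return fields;
    }

    static void Main()
    {
        string original = "LIC#:ABC123 YRMD:03 MAKE:CHEV BTM :CP VIN :1G1JC12F137230800";
        foreach (KeyValuePair<string, string> field in Parse(original))
        {
            Console.WriteLine("{0} = {1}", field.Key, field.Value);
        }
    }
}
```

    This approach does not need to know the tag names in advance, but it breaks as soon as a value contains a space, which is why the fixed-tag patterns later in the thread may be the safer choice for this data.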

    Steve Messer wrote:

      I need to be able to parse the following line, and I am not even sure regular expressions are the best option.

      LIC#:ABC123 YRMD:03 MAKE:CHEV BTM :CP VIN :1G1JC12F137230800

      into

      LIC#:ABC123
      YRMD:03
      MAKE:CHEV
      BTM :CP
      VIN :1G1JC12F137230800

      and then capture the data after the colon. All can be variable length and there isn't a delimiter. Unfortunately, you also can't depend on the tags having no spaces before the colons (i.e. VIN : vs. YRMD:). I have been trying to use grouping to parse this. I am newish to regular expressions and thought maybe I should use them to solve this problem. Are regular expressions even a good fit for this problem? Any suggestions appreciated.

      modified on Friday, March 21, 2008 12:01 PM

      PIEBALDconsult wrote (post #8):

      This seems to work

      if ( args.Length > 0 )
      {
          //LIC#:ABC123 YRMD:03 MAKE:CHEV BTM :CP VIN :1G1JC12F137230800

          System.Text.RegularExpressions.Regex reg = new System.Text.RegularExpressions.Regex
          (
              @"^\s*LIC#\s*: (?'LIC'.*)YRMD\s*: (?'YRMD'.*)MAKE\s*: (?'MAKE'.*)BTM\s*: (?'BTM'.*)VIN\s: (?'VIN'.*)$"
          ) ;

          foreach ( System.Text.RegularExpressions.Match mat in reg.Matches ( args [ 0 ] ) )
          {
              System.Console.WriteLine
              (
                  "LIC# = {0} YRMD = {1} MAKE = {2} BTM = {3} VIN = {4}"
              ,
                  mat.Groups [ "LIC" ].Value
              ,
                  mat.Groups [ "YRMD" ].Value
              ,
                  mat.Groups [ "MAKE" ].Value
              ,
                  mat.Groups [ "BTM" ].Value
              ,
                  mat.Groups [ "VIN" ].Value
              ) ;
          }
      }

      Dagnabit! Frowny faces?! Who wrote this crap? I added a SPACE between each : and ( so the forum wouldn't turn them into frowny faces, but those SPACEs should be removed from the Regex.

      modified on Friday, March 21, 2008 1:52 PM


        Dan Neely wrote (post #9):

        PIEBALDconsult wrote:

        Dagnabit! Frowny faces?! Who wrote this crap?

        Paging Chris Maunder. :doh:

        Otherwise [Microsoft is] toast in the long term no matter how much money they've got. They would be already if the Linux community didn't have it's head so firmly up it's own command line buffer that it looks like taking 15 years to find the desktop. -- Matthew Faithfull


          Steve Messer wrote (post #10):

          Hmm, unless the copy and paste messed something up, this is not producing a match for me.


            PIEBALDconsult wrote (post #11):

            With the SPACE between the : and ( you need to use System.Text.RegularExpressions.RegexOptions.IgnorePatternWhitespace to ignore the extraneous SPACEs, but then the # and everything after it become a comment! :mad: So now I've escaped the # as \x23, resulting in:

            System.Text.RegularExpressions.Regex reg = new System.Text.RegularExpressions.Regex
            (
            @"^\s*LIC\x23\s*: (?'LIC'.*)YRMD\s*: (?'YRMD'.*)MAKE\s*: (?'MAKE'.*)BTM\s*: (?'BTM'.*)VIN\s: (?'VIN'.*)$"
            ,
            System.Text.RegularExpressions.RegexOptions.IgnorePatternWhitespace
            ) ;

            Whoops, I had left out an asterisk I had meant to include: VIN\s*

            modified on Friday, March 21, 2008 2:35 PM
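
            A sketch of what IgnorePatternWhitespace allows once the # is escaped, assuming the VIN\s* correction mentioned above is applied: the pattern can be spread over several lines and even carry a # comment of its own. The class name CommentedPatternDemo and the layout are illustrative only and were not posted in the thread.

```csharp
using System;
using System.Text.RegularExpressions;

static class CommentedPatternDemo
{
    static void Main()
    {
        // With IgnorePatternWhitespace, unescaped spaces and line breaks in the
        // pattern are ignored and '#' starts a comment that runs to end of line,
        // so the literal '#' in "LIC#" has to be written as \x23.
        Regex reg = new Regex(
            @"^\s* LIC\x23 \s*: (?'LIC'.*)    # tag, optional spaces, colon, value
                   YRMD    \s*: (?'YRMD'.*)
                   MAKE    \s*: (?'MAKE'.*)
                   BTM     \s*: (?'BTM'.*)
                   VIN     \s*: (?'VIN'.*) $",
            RegexOptions.IgnorePatternWhitespace);

        Match mat = reg.Match("LIC#:ABC123 YRMD:03 MAKE:CHEV BTM :CP VIN :1G1JC12F137230800");
        if (mat.Success)
        {
            // Brackets make any whitespace captured along with a value visible.
            foreach (string name in new[] { "LIC", "YRMD", "MAKE", "BTM", "VIN" })
            {
                Console.WriteLine("{0} = [{1}]", name, mat.Groups[name].Value);
            }
        }
    }
}
```

            With this pattern every value except the last keeps the space that separates it from the next tag, so the values may still need trimming.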


              ChrisKo 0 wrote (post #12):

              Quick little edit to get rid of the extra space that was being captured.

              ^\s*LIC\x23\s*: (?'LIC'.*)\sYRMD\s*: (?'YRMD'.*)\sMAKE\s*: (?'MAKE'.*)\sBTM\s*: (?'BTM'.*)\sVIN\s: (?'VIN'.*)$
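
              Put together as a complete program, the pattern above might be used as follows. This is a sketch under a few assumptions: it keeps the spaces after the colons (so it needs IgnorePatternWhitespace), it writes VIN\s* instead of VIN\s as per the earlier correction, and the names LegacyLineParser and TryParse are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class LegacyLineParser
{
    // The \s in front of each tag consumes the single separating space,
    // so it is no longer captured as part of the preceding value.
    static readonly Regex Line = new Regex(
        @"^\s*LIC\x23\s*: (?'LIC'.*)\sYRMD\s*: (?'YRMD'.*)\sMAKE\s*: (?'MAKE'.*)\sBTM\s*: (?'BTM'.*)\sVIN\s*: (?'VIN'.*)$",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns false when the line does not have the
    // expected five tags in the expected order.
    static bool TryParse(string input, out string[] values)
    {
        Match mat = Line.Match(input);
        values = mat.Success
            ? new[]
              {
                  mat.Groups["LIC"].Value,
                  mat.Groups["YRMD"].Value,
                  mat.Groups["MAKE"].Value,
                  mat.Groups["BTM"].Value,
                  mat.Groups["VIN"].Value
              }
            : new string[0];
        return mat.Success;
    }

    static void Main()
    {
        string[] v;
        if (TryParse("LIC#:ABC123 YRMD:03 MAKE:CHEV BTM :CP VIN :1G1JC12F137230800", out v))
        {
            // prints "LIC# = ABC123 YRMD = 03 MAKE = CHEV BTM = CP VIN = 1G1JC12F137230800"
            Console.WriteLine("LIC# = {0} YRMD = {1} MAKE = {2} BTM = {3} VIN = {4}",
                v[0], v[1], v[2], v[3], v[4]);
        }
    }
}
```

              Note that \s consumes exactly one whitespace character; if the legacy data can contain several spaces between a value and the next tag, calling Trim() on each value is a simple alternative.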


                PIEBALDconsult wrote (post #13):

                Hey, I was leaving that for the OP to do; I didn't want to solve the whole thing for him. :-D


                  ChrisKo 0 wrote (post #14):

                  Sorry, I was bored and happened to have The Regulator open. At least now I can enjoy the weekend knowing that I accomplished something today. :laugh:


                    Steve Messer wrote (post #15):

                    Thanks all, your comments and examples have been very enlightening.

                    modified on Friday, March 21, 2008 6:15 PM


                      PIEBALDconsult wrote (post #16):

                      Glad to be of service.
