regex name/value pairs

    campk wrote (#1):

    I'm a little stumped with a regex I've been working on. I'm trying to parse name/value pairs from a string. I have something close, but it chokes on certain cases.

    (?# PROPERTY )(?<Property>[a-zA-Z0-9_]*)\x20*?(?# OPERATOR )(?<Operator>>=|<=|<|>|=|!=|LIKE)(?# Value )\x20*?'?\w+'?

    // any property name of any length, captured into backreference 'Property'
    (?<Property>[a-zA-Z0-9_]*)
    // whitespace, 0+, minimal matching
    \x20*?
    // any of the operators >=, <=, <, >, =, !=, LIKE, captured into backreference 'Operator'
    (?<Operator>>=|<=|<|>|=|!=|LIKE)
    // whitespace, 0+, minimal matching
    \x20*?
    // single quote, 0-1
    // followed by word character, 1+
    // followed by single quote, 0-1
    '?\w+'?

    I am looking to match (and extract) name/operator/value pairs from a string such as:

    PropertyName='SomeValue' AND IntProperty < 9 OR AnotherProperty LIKE 'this is a test'

    The regex above works fine for the first two terms, but when it reaches a quoted string it only matches up to the first space. This is only ever going to be an issue with quoted strings; anything else will assume a word boundary on whitespace, which is the desired behaviour.

    I need the last part of the regex to say, in effect, "if we're enclosed in single quotes, get anything between the opening and closing quote; otherwise, match everything up to whitespace".

    Any help is much appreciated (or if anyone spots any weak spots in the regex I have so far).

    Thanks
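For reference, the behaviour described in this post can be reproduced with a short sketch. It is written in Python rather than .NET, so named groups are spelled (?P<name>...) instead of (?<name>...), and the inline (?# ...) comments are left out. The Value group name is an assumption added here only to make the captured text visible; the original pattern does not capture the value.

```python
import re

# Python port of the pattern from the post (assumption: .NET would use
# (?<Property>...) and (?<Operator>...) for the same named groups).
# "Value" is a hypothetical group name added for illustration.
ORIGINAL = re.compile(
    r"(?P<Property>[a-zA-Z0-9_]*)\x20*?"
    r"(?P<Operator>>=|<=|<|>|=|!=|LIKE)\x20*?"
    r"(?P<Value>'?\w+'?)"
)

text = ("PropertyName='SomeValue' AND IntProperty < 9 "
        "OR AnotherProperty LIKE 'this is a test'")

matches = [(m["Property"], m["Operator"], m["Value"])
           for m in ORIGINAL.finditer(text)]

for triple in matches:
    print(triple)
# ('PropertyName', '=', "'SomeValue'")
# ('IntProperty', '<', '9')
# ('AnotherProperty', 'LIKE', "'this")
```

The third value stops at the first space because \w does not match whitespace, which is exactly the symptom reported.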


      Keith Malwitz wrote (#2):

      You will need to test for quotes and assign the matched substring a name. Do this at the first place in your expression where the quotes can occur. This will test for an optional single or double quote:

      (?<quote>[""']?)

      *The double quote is repeated as shown if it appears in a VB string.

      You must then use conditional matching to test whether <quote> has been assigned. The syntax for the conditional match is:

      (?(name)yes|no)

      The |no portion is optional. So,

      (?(quote)\k<quote>)

      will test whether <quote> was previously assigned, and if so it will match it again. Otherwise, it does nothing. Hope that helps.
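A minimal sketch of the capture-then-conditional idea, in Python, whose re module also supports (?(name)yes|no). Two things here are assumptions rather than part of the reply: Python spells the backreference (?P=quote) where .NET uses \k<quote>, and the ? is moved outside the group so that the group is genuinely unset when no quote is present. In Python a group that matched the empty string still counts as set, so with ["']? inside the group the conditional would always take the yes branch. The body group and the parse_value helper are hypothetical names.

```python
import re

# .NET spelling (assumed equivalent): (?<quote>["'])?(?<body>\w+)(?(quote)\k<quote>)
VALUE = re.compile(r"""(?P<quote>["'])?(?P<body>\w+)(?(quote)(?P=quote))""")

def parse_value(text):
    """Return the value without its quotes, or None if the quotes do not pair up."""
    m = VALUE.fullmatch(text)
    return m["body"] if m else None

print(parse_value("'Blah'"))    # Blah
print(parse_value('"asdf"'))    # asdf
print(parse_value("9"))         # 9
print(parse_value("'Blah\""))   # None (opening and closing quote differ)
```

This only covers values made of word characters; values containing spaces need a different body pattern.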


        campk wrote (#3):

        Keith - I was headed in that direction, but couldn't quite get it. Thanks for the help. What you suggested has given me:

        (?<Property>[a-zA-Z0-9_]*) \x20*? (?<Operator>>=|<=|<|>|=|!=|LIKE) \x20*? (?<quote>[""']?)\w+?\k<quote> \x20*?

        which matches the following

        PropertyName = 'Blah' AND PropertyTwo >= 9 OR PropertyThree = "asdf" OR AnotherProperty >= 'thisis atest'

        perfectly up until the last part:

        AnotherProperty='thisis atest'

        The \w+? stops at the space between 'is' and 'atest', which is where I am stuck now. Using .* or something similar captures too much. I am inclined to believe that a negated character class, [^""'], is the way to go, but I am not sure how to apply it; the things I have been trying wind up matching too much. Any ideas?

        Thanks

        -- modified at 9:21 Friday 5th January, 2007
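One way the negated character class could be applied is to split the value into two alternatives: a quoted form whose body is [^"']* and an unquoted form that is just \w+. The sketch below is in Python, so it uses (?P<name>...) and (?P=quote) where .NET would use (?<name>...) and \k<quote>. The group names Quoted and Bare and the parse_pairs helper are hypothetical, and the property name is required to be at least one character, which is a change from the * in the post.

```python
import re

PAIR = re.compile(
    r"""(?P<Property>[a-zA-Z0-9_]+)\x20*
        (?P<Operator>>=|<=|<|>|=|!=|LIKE)\x20*
        (?:(?P<quote>["'])(?P<Quoted>[^"']*)(?P=quote)|(?P<Bare>\w+))""",
    re.VERBOSE,  # lets the pattern span lines; \x20 still matches a space
)

def parse_pairs(text):
    """Return (property, operator, value) triples, with quotes stripped."""
    triples = []
    for m in PAIR.finditer(text):
        value = m["Quoted"] if m["quote"] is not None else m["Bare"]
        triples.append((m["Property"], m["Operator"], value))
    return triples

sample = ("PropertyName = 'Blah' AND PropertyTwo >= 9 "
          "OR PropertyThree = \"asdf\" OR AnotherProperty >= 'thisis atest'")
for triple in parse_pairs(sample):
    print(triple)
# ('PropertyName', '=', 'Blah')
# ('PropertyTwo', '>=', '9')
# ('PropertyThree', '=', 'asdf')
# ('AnotherProperty', '>=', 'thisis atest')
```

Because the class excludes both quote characters, the body cannot run past the closing quote. The trade-off is that a double-quoted value cannot contain an apostrophe, and vice versa.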


          Mike Dimmick wrote (#4):

          You really have a grammar there, not just a regular expression. I'd recommend using something like ANTLR to parse your expressions. It's a lot less of a headache than trying to write a single RE that does the whole job.

          Stability. What an interesting concept. -- Chris Maunder
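To illustrate the grammar-based approach in miniature, here is a hand-written tokenizer and parser in Python. This is not ANTLR and not generated code; it is only a sketch of the same idea (split the input into tokens, then check their order), assuming a flat grammar of the form "name operator value" joined by AND or OR. All names in it (TOKEN, tokenize, parse, the token kinds) are hypothetical.

```python
import re

TOKEN = re.compile(
    r"""\s*(?:
        (?P<STRING>'[^']*'|"[^"]*")   # quoted value, quotes included
      | (?P<OP>>=|<=|!=|<|>|=)        # symbolic comparison operators
      | (?P<WORD>\w+)                 # names, numbers, LIKE, AND, OR
    )""",
    re.VERBOSE,
)
KEYWORD_OPERATORS = {"LIKE"}
CONNECTIVES = {"AND", "OR"}

def tokenize(text):
    """Split text into (kind, lexeme) pairs, raising on unrecognised input."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((m.lastgroup, m[m.lastgroup]))
        pos = m.end()
    return tokens

def parse(text):
    """Return (name, op, value) triples interleaved with 'AND'/'OR' strings."""
    tokens = tokenize(text)
    result, i = [], 0
    while True:
        if len(tokens) - i < 3:
            raise ValueError("expected: name operator value")
        (name_kind, name), (op_kind, op), (value_kind, value) = tokens[i:i + 3]
        is_operator = op_kind == "OP" or (op_kind == "WORD" and op in KEYWORD_OPERATORS)
        if name_kind != "WORD" or not is_operator or value_kind == "OP":
            raise ValueError(f"malformed comparison near {name!r}")
        if value_kind == "STRING":
            value = value[1:-1]  # drop the surrounding quotes
        result.append((name, op, value))
        i += 3
        if i == len(tokens):
            return result
        kind, word = tokens[i]
        if kind != "WORD" or word not in CONNECTIVES:
            raise ValueError(f"expected AND or OR, got {word!r}")
        result.append(word)
        i += 1

print(parse("PropertyName='SomeValue' AND IntProperty < 9 "
            "OR AnotherProperty LIKE 'this is a test'"))
```

Unlike a single regex scanned across the string, this rejects malformed input instead of silently skipping it, which is the main practical benefit of treating the expression as a grammar.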


            Keith Malwitz wrote (#5):

            Try replacing the \w+?\k<quote> with:

            .+\k<quote>

            This will match everything between the quotes. The \w matches only word characters, so it does not match the spaces. The period matches any character (except a newline, by default), and since we want everything between the quotes, we use + to require one or more of them. Hope that helps.
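One caveat worth checking before adopting this: .+ is greedy, so when the input holds more than one quoted value it runs to the last matching quote rather than the nearest one. The Python sketch below compares the greedy form with a lazy .+? variant. The syntax differs from .NET only in spelling ((?P<quote>...) and (?P=quote) instead of (?<quote>...) and \k<quote>). The Value group name and the two-value sample string are assumptions made for the demonstration, and the lazy variant makes the quote mandatory, so unquoted values would need their own alternative.

```python
import re

# Greedy body, as suggested in the reply.
GREEDY = re.compile(r"""=\x20*(?P<quote>["']?)(?P<Value>.+)(?P=quote)""")
# Lazy body with a mandatory quote: stops at the nearest closing quote.
LAZY = re.compile(r"""=\x20*(?P<quote>["'])(?P<Value>.+?)(?P=quote)""")

text = "A = 'one two' OR B = 'three'"

greedy_value = GREEDY.search(text)["Value"]
lazy_values = [m["Value"] for m in LAZY.finditer(text)]

print(greedy_value)  # one two' OR B = 'three
print(lazy_values)   # ['one two', 'three']
```

The greedy result swallows the second comparison, which matches the earlier observation in the thread that .* "captures too much".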
