Who is afraid of regex?

    Mike Hankey wrote:

    Regex is black magic, and sacrifices of caffeine and pizza offered in copious amounts are the only way to appease the beast.

    I'm not sure how many cookies it makes to be happy, but so far it's not 27. JaxCoder.com

    honey the codewitch wrote (#14), in reply to Mike Hankey:

    But it's such useful, compelling black magic! ;P

    Real programmers use butterflies

      honey the codewitch wrote:

      (I'm ignoring backtracking regex here because it's dirty, and algorithmically less useful except for making it easier for the user to match text.)

      Anyway, it's just a tiny functional programming language with only ()|?* as its 4 explicit operators, plus 1 implicit one (concatenation). Representing the regex programming language as code: any regex is mathematically equivalent to the DFA state machine it represents, and can be converted algorithmically back and forth between a state machine and a regular expression. Perfect compilation/decompilation.

      So you can use them to match text (boring!), or you can use them to generate code for state machines (less boring!).

      And yet I've met a lot of programmers who either loathe them, are intimidated by them, or both. They're wonderful little things with interesting mathematical properties, but more importantly, they're useful for everything quick and dirty.
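To make the regex-to-state-machine claim concrete, here is a minimal Python sketch, not the poster's own code. It assumes a pattern made only of literal characters plus ( ) | ? * and implicit concatenation, builds an NFA in the Thompson style, and then applies the subset construction to get a DFA. The names `compile_dfa` and `dfa_match` are made up for this illustration.

```python
from itertools import count

def compile_dfa(pattern):
    """Compile a regex limited to ( ) | ? * and concatenation into a DFA."""
    ids = count()
    eps = {}     # NFA state -> states reachable by one epsilon move
    moves = {}   # (NFA state, char) -> set of target NFA states
    pos = 0      # current parse position in pattern

    def new_state():
        s = next(ids)
        eps[s] = set()
        return s

    def parse_alt():                      # alt := concat ('|' concat)*
        nonlocal pos
        start, end = parse_concat()
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            s2, e2 = parse_concat()
            ns, ne = new_state(), new_state()
            eps[ns] |= {start, s2}
            eps[end].add(ne)
            eps[e2].add(ne)
            start, end = ns, ne
        return start, end

    def parse_concat():                   # concat := repeat*
        start = end = new_state()         # begins as the empty fragment
        while pos < len(pattern) and pattern[pos] not in "|)":
            s2, e2 = parse_repeat()
            eps[end].add(s2)
            end = e2
        return start, end

    def parse_repeat():                   # repeat := atom ('*' | '?')*
        nonlocal pos
        start, end = parse_atom()
        while pos < len(pattern) and pattern[pos] in "*?":
            op = pattern[pos]
            pos += 1
            ns, ne = new_state(), new_state()
            eps[ns] |= {start, ne}        # allow skipping the atom
            eps[end].add(ne)
            if op == "*":
                eps[end].add(start)       # allow looping back
            start, end = ns, ne
        return start, end

    def parse_atom():                     # atom := '(' alt ')' | literal
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch in "*?":
            raise ValueError("nothing to repeat")
        if ch == "(":
            frag = parse_alt()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return frag
        s, e = new_state(), new_state()
        moves.setdefault((s, ch), set()).add(e)
        return s, e

    def closure(states):
        """All NFA states reachable from states through epsilon moves."""
        stack, seen = list(states), set(states)
        while stack:
            for t in eps[stack.pop()]:
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return frozenset(seen)

    start, accept = parse_alt()
    if pos != len(pattern):
        raise ValueError("unbalanced ')'")

    # Subset construction: each DFA state is a frozenset of NFA states.
    alphabet = {ch for (_, ch) in moves}
    d_start = closure({start})
    table, todo = {}, [d_start]
    while todo:
        cur = todo.pop()
        if cur in table:
            continue
        table[cur] = {}
        for ch in alphabet:
            targets = set()
            for s in cur:
                targets |= moves.get((s, ch), set())
            if targets:
                nxt = closure(targets)
                table[cur][ch] = nxt
                todo.append(nxt)
    accepting = {s for s in table if accept in s}
    return d_start, table, accepting

def dfa_match(dfa, text):
    """Run the DFA over text; True only if the whole text is accepted."""
    state, table, accepting = dfa
    for ch in text:
        state = table[state].get(ch)
        if state is None:
            return False
    return state in accepting

dfa = compile_dfa("(ab|c)*d?")
print(dfa_match(dfa, "abcabd"))  # True
```

The resulting `table` is a plain transition table, which is the kind of structure a state-machine code generator could walk. Going the other way (DFA back to a regex) is not shown here.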

      Real programmers use butterflies

      Lost User wrote (#15), in reply to honey the codewitch:

      Reminds me of APL. The language of '80s modelling gods. Cryptic enough that they had their own department. At least in assembler we had a few letters for op codes. Afraid? It's just too forgettable (for me).

      It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food

        honey the codewitch wrote (#16), in reply to Lost User (#15):

        I built a regex DOM for people like you. :suss: :laugh:

        Real programmers use butterflies

          Chris Maunder wrote (#17), in reply to honey the codewitch's "tiny functional programming language" post:

          By Law we need to quote this XKCD. xkcd: Regular Expressions. I used to have this T-shirt, too: Regular Expressions. Personally, regular expressions are my indulgent cheat. Kinda like having pizza. I know I should go easy on them, and I'm trying to give them up, but when they are good, they are sooo good.

          cheers Chris Maunder

            Mike Hankey wrote (#18), in reply to honey the codewitch (#14):

            That it is. I've used the Expresso Regular Expression Tool for years. It helps, but I still can't wrap my head around it. That and dark matter... :)

            I'm not sure how many cookies it makes to be happy, but so far it's not 27. JaxCoder.com

              honey the codewitch wrote (#19), in reply to Chris Maunder (#17):

              I would wear that shirt, but then people would ask me what it meant, and if I told them, they would ask me to fix their computers. Apparently I have no impulse control, because I even use regular expressions for things they were never intended for. :-\

              Real programmers use butterflies

                honey the codewitch wrote (#20), in reply to Mike Hankey (#18):

                Forget backtracking regular expressions, as they don't have the same fancy mathematical properties as their non-backtracking counterparts. Use the non-backtracking operators and there are only 5 operations to remember: concatenation, alternation, parentheses, zero-or-one match (?), and the Kleene star (looping *, a zero-or-more match). Concatenation is implicit.

                They are:

                1. Simpler to understand
                2. Faster to execute
                3. Weirdly mathy, but in a cool way
                4. The same across almost all regular expression engines

                I give a primer at the end of this article. I taught them to my computer, and trust me, it's not very smart, but then I also taught it C in that article: Fun With State Machines: Incrementally Parsing Numbers Using Hacked Regex
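The five operations named in this post can be tried one at a time. This is a small illustrative sketch using Python's standard `re` module. Python's engine does backtrack, but for patterns restricted to these operators it accepts the same strings as a non-backtracking engine would. The helper name `matches` and the sample words are assumptions made for the example.

```python
import re

def matches(pattern, text):
    """True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# 1. Concatenation (implicit): "a" followed by "b"
assert matches(r"ab", "ab")

# 2. Alternation: either side of the bar
assert matches(r"cat|dog", "dog")

# 3. Parentheses: group a sub-expression
assert matches(r"gr(a|e)y", "grey")
assert not matches(r"gr(a|e)y", "gry")

# 4. ? : zero or one of the preceding item
assert matches(r"colou?r", "color")
assert matches(r"colou?r", "colour")

# 5. * (Kleene star): zero or more of the preceding item
assert matches(r"go*gle", "ggle")
assert matches(r"go*gle", "gooogle")
```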

                Real programmers use butterflies

                  trønderen wrote (#21), in reply to Chris Maunder (#17):

                  Chris Maunder wrote:

                  By Law we need to quote this XKCD. xkcd: Regular Expressions

                  I love the popup text of that one!

                    trønderen wrote (#22), in reply to honey the codewitch's "tiny functional programming language" post:

                    The only regex(-like) syntax I felt somewhat comfortable working with was SNOBOL :-) That was 30+ years ago. I first met it as a 200-source-line version of Eliza, the therapist, which fascinated me immensely. Obviously, that version never passed any Turing test, yet: try to write anything comparable in 200 lines of any ordinary, algorithmic language! So I started playing around with it, just for fun. I never used it commercially. Actually, not too long ago I picked up the source code of an old SNOBOL interpreter, hoping one day to port it. It is currently #43 on my project list. Tuits are hard to find nowadays, especially round ones.

                      Gary R Wheeler wrote (#23), in reply to Mike Hankey (#18):

                      Mike Hankey wrote:

                      dark matter

                      Dark Matter; great series. A shame it only went three seasons.

                      Software Zen: delete this;

                        Gary R Wheeler wrote (#24), in reply to honey the codewitch's "tiny functional programming language" post:

                        I like using regex for day-to-day, throwaway things. It's especially good for reformatting text. I'm certainly not intimidated by them. That said, I don't think I would ever use one in product code with a long life-span. You must admit that regular expressions tend to be write-only, which is a cardinal sin against those who must maintain the code, including your future selves. Code written very concisely (and regular expressions may be the ultimate in concise) requires a lot of mental unpacking during maintenance. Unless you write a ridiculous amount of comments for the expression, it might not be worth it.
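As an illustration of both points (text reformatting, and the amount of commenting a maintainable expression needs), here is a small Python sketch. It is not taken from the thread: the date-rewriting task, the `US_DATE` pattern and the `to_iso` helper are all hypothetical. It relies on `re.VERBOSE`, which lets whitespace and `#` comments sit inside the pattern.

```python
import re

# Hypothetical task: rewrite US-style dates (MM/DD/YYYY) as ISO dates.
US_DATE = re.compile(
    r"""
    \b(?P<month>[0-9]{1,2})   # month, one or two digits
    /(?P<day>[0-9]{1,2})      # day, one or two digits
    /(?P<year>[0-9]{4})\b     # four-digit year
    """,
    re.VERBOSE,
)

def to_iso(text):
    """Replace every MM/DD/YYYY date in text with YYYY-MM-DD."""
    def fix(m):
        # Zero-pad month and day so the result is always 10 characters.
        return f"{m['year']}-{int(m['month']):02d}-{int(m['day']):02d}"
    return US_DATE.sub(fix, text)

print(to_iso("Shipped 3/7/2021, due 12/25/2021."))
# Shipped 2021-03-07, due 2021-12-25.
```

The pattern does no range checking (it would happily rewrite 99/99/2021), which is the sort of limitation a comment in long-lived code would need to spell out.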

                        Software Zen: delete this;

                          honey the codewitch wrote (#25), in reply to Gary R Wheeler (#24):

                          I think it depends. I generally agree that complicated regex is a mug's game. However, how do you technically and accurately convey a set of rules around lexical requirements? Such rules must be able to be conveyed to other developers precisely. Such rules must be unambiguous, and testable. Such rules must be absorbable in a reasonable amount of time, meaning no poring over RFCs if one can avoid it. Imagine conveying the rules for what constitutes a JSON number. You can either say:

                          (\-?)(0|[1-9][0-9]*)((\.[0-9]+)?([Ee][\+\-]?[0-9]+)?)

                          Which takes some unpacking, as you say, but is certainly readable. Or I can give you a page-long document of requirements around JSON number parsing. Personally, I can read that quite easily, but that's me. Let me propose something: there is a meaningful subset of regular expressions which are easy to understand, and can fulfill most simple lexical specifications like the above, or say, like an email address, or a URL, or any number of small, structured text fragments. It beats the alternative, hands down.
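One way to see the "unambiguous and testable" point is to treat the expression above as an executable specification. The sketch below copies the pattern from this post into Python's `re` module. The helper name `is_json_number` and the sample inputs are assumptions added for illustration.

```python
import re

# The pattern from the post, unchanged.
JSON_NUMBER = re.compile(
    r"(\-?)(0|[1-9][0-9]*)((\.[0-9]+)?([Ee][\+\-]?[0-9]+)?)"
)

def is_json_number(text):
    """True when the whole text is a JSON number according to the pattern."""
    return JSON_NUMBER.fullmatch(text) is not None

# Inputs the pattern accepts.
for good in ["0", "-0", "12", "-1.5", "1e10", "2.5E-3"]:
    assert is_json_number(good), good

# Inputs the pattern rejects: leading zero, bare dot, leading plus,
# missing fraction digits, missing exponent digits.
for bad in ["01", ".5", "+1", "1.", "1e", ""]:
    assert not is_json_number(bad), bad
```

Each accepted or rejected string doubles as a test case, which is harder to get out of a page of prose requirements.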

                          Real programmers use butterflies

                            realJSOP wrote (#26), in reply to honey the codewitch's "tiny functional programming language" post:

                            I make what could be regarded as "heroic effort" to avoid using regex whenever possible. However, I used it in a recent application because it was the most expedient way to do what I needed.

                            ".45 ACP - because shooting twice is just silly" - JSOP, 2010
                            -----
                            You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
                            -----
                            When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013

                              Gary R Wheeler wrote (#27), in reply to honey the codewitch (#25):

                              To my mind that regex would be okay. It's the thousands of characters, wall-of-text abominations that I object to. I know, that's an example of poor use of regex, but it's the kind of thing you find. Inexperienced folks start using it, and all of a sudden it becomes their favorite toy. A toy that's all sharp edges...

                              Software Zen: delete this;

                                Slacker007 wrote (#28), in reply to Gary R Wheeler (#24):

                                We use regular expressions in production code all the time, almost always for validation: phone numbers, emails, web addresses, and other specialized pattern validation. We've never had any issues: performance has never been an issue, and accuracy has never been an issue. Most of the uses require zero maintenance after implementation.
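A rough sketch of what that kind of validation code can look like, in Python. The two formats below are deliberately simple and entirely hypothetical (a fixed NNN-NNN-NNNN phone shape and a loose "something@something.something" email shape). They are not the poster's actual rules, and real-world formats vary widely.

```python
import re

# Hypothetical, deliberately simple formats; real rules vary by application.
VALIDATORS = {
    "phone": re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}

def is_valid(kind, value):
    """True when value matches the whole pattern registered for kind."""
    return VALIDATORS[kind].fullmatch(value) is not None

print(is_valid("phone", "555-867-5309"))  # True
print(is_valid("email", "user@example"))  # False
```

Compiling the patterns once and always using `fullmatch` keeps each check cheap and avoids accidentally accepting a valid fragment inside a longer invalid string.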

                                  honey the codewitch wrote (#29), in reply to Gary R Wheeler (#27):

                                  :laugh: I won't argue with that.

                                  Real programmers use butterflies

                                    honey the codewitch wrote (#30), in reply to realJSOP (#26):

                                    Expedience is how it gets you. Next thing you know, you're hooked. :laugh:

                                    Real programmers use butterflies

                                      realJSOP wrote (#31), in reply to honey the codewitch (#30):

                                      I’ve resisted being hooked on its expedience since its inception. The way I see it, I will be able to easily make it to retirement (three years) without getting hooked on it.

                                      ".45 ACP - because shooting twice is just silly" - JSOP, 2010
                                      -----
                                      You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
                                      -----
                                      When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013

                                        honey the codewitch wrote (#32), in reply to Gary R Wheeler (#27):

                                        I feel like if I were in charge of development practices at a given shop, and regex were used as documentation as I suggested, strict limits would be placed on its use. For starters, [^-]().+*?| is all you get. That keeps it simple, portable, and non-backtracking. You can easily generate flow diagrams from it. And it keeps people from getting ... "creative".
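A house rule like that could be enforced mechanically. The following is a rough Python sketch of one possible checker, under assumptions that go beyond what the post states: it allows literals, escaped punctuation, character classes, and the operators ( ) . + * ? |, and it rejects counted repetition, anchors, escaped letters or digits (so no \d, \b or backreferences), special groups such as (?:...), and stacked quantifiers such as the lazy *?. The function names are hypothetical.

```python
def class_end(pattern, start):
    """Index of the ']' closing the class opened at start, or -1."""
    i = start + 1
    if i < len(pattern) and pattern[i] == "^":
        i += 1                      # negated class
    first = i                       # a ']' here would be a literal
    while i < len(pattern):
        if pattern[i] == "\\":
            # Only escaped punctuation is allowed inside a class.
            if i + 1 >= len(pattern) or pattern[i + 1].isalnum():
                return -1
            i += 2
        elif pattern[i] == "]" and i > first:
            return i
        elif pattern[i] == "[":
            return -1               # no nested or POSIX classes
        else:
            i += 1
    return -1                       # class was never closed

def uses_only_simple_subset(pattern):
    """True when pattern stays within [^-] ( ) . + * ? | plus literals."""
    depth = 0       # current parenthesis nesting
    atom = False    # True when the previous token may take a quantifier
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # Escaped punctuation only: rejects \d, \b, \1 and friends.
            if i + 1 >= len(pattern) or pattern[i + 1].isalnum():
                return False
            i, atom = i + 2, True
        elif ch == "[":
            end = class_end(pattern, i)
            if end < 0:
                return False
            i, atom = end + 1, True
        elif ch == "(":
            depth += 1
            i, atom = i + 1, False  # so "(?" is rejected by the next branch
        elif ch == ")":
            if depth == 0:
                return False
            depth -= 1
            i, atom = i + 1, True
        elif ch in "*+?":
            if not atom:            # nothing to repeat, or stacked quantifier
                return False
            i, atom = i + 1, False
        elif ch == "|":
            i, atom = i + 1, False
        elif ch in "{}^$]":
            return False            # counted repetition, anchors, stray ']'
        else:
            i, atom = i + 1, True   # ordinary literal or '.'
    return depth == 0

json_number = r"(\-?)(0|[1-9][0-9]*)((\.[0-9]+)?([Ee][\+\-]?[0-9]+)?)"
print(uses_only_simple_subset(json_number))  # True
print(uses_only_simple_subset(r"(a)\1"))     # False
```

A check like this could run in code review or a pre-commit hook. The exact boundary of the subset is a judgment call, and this sketch errs on the strict side.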

                                        Real programmers use butterflies

                                          honey the codewitch wrote (#33), in reply to realJSOP (#31):

                                          I don't know that I'll live long enough to retire from coding, though I may retire from doing it professionally. I've always loved it. I suspect I always will. :)

                                          Real programmers use butterflies
