Code Project

Who is afraid of regex?

Posted in The Lounge
Tags: regex, css, functional, question
42 posts, 20 posters

    honey the codewitch wrote (#1):

    (I'm ignoring backtracking regex here because it's dirty, and algorithmically less useful except for making it easier for the user to match text.)

    Anyway, it's just a tiny functional programming language with only ()|?*: 4 explicit operators and 1 implicit one (concatenation).

    Representing the regex programming language as code: any regex is mathematically equivalent to the DFA state machine it represents, and can be converted algorithmically back and forth between a state machine and a regular expression. Perfect compilation/decompilation, in the sense that the round trip preserves the set of strings matched.

    So you can use them to match text (boring!) or you can use them to generate code for state machines (less boring!).

    And yet I've met a lot of programmers that either loathe them, are intimidated by them, or both. They're wonderful little things, with interesting mathematical properties, but more importantly, they're useful for everything quick and dirty.
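To make the regex/DFA equivalence concrete, here is a minimal sketch in Python. The pattern `(a|b)*abb`, the state numbering, and the names `TRANSITIONS`, `dfa_accepts` and `agrees_with_regex` are all assumptions chosen for illustration; none of them come from the post. The table is a hand-built DFA, and the check compares it against Python's own `re` engine on every string over {a, b} up to a given length.

```python
import re
from itertools import product

# Hand-built DFA for the regex (a|b)*abb (illustrative numbering).
# State 3 is the only accepting state.
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
ACCEPTING = {3}


def dfa_accepts(text: str) -> bool:
    """Run the DFA over text: one table lookup per character."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition for this character: reject
            return False
    return state in ACCEPTING


def agrees_with_regex(max_len: int = 8) -> bool:
    """Compare the DFA with re.fullmatch on all strings over {a, b}."""
    pattern = re.compile(r"(a|b)*abb")
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if dfa_accepts(s) != (pattern.fullmatch(s) is not None):
                return False
    return True
```

The exhaustive comparison is only a sanity check on short strings, not a proof of equivalence.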

    Real programmers use butterflies

      Kornfeld Eliyahu Peter wrote (#2), in reply to honey the codewitch (#1):

      I think most of us dislike regex because of bad experiences with past implementations in different environments... Almost all of my validation is done with regex, but I have never done enough code generation to really use it there...

      "The only place where Success comes before Work is in the dictionary." Vidal Sassoon, 1928 - 2012

      "It never ceases to amaze me that a spacecraft launched in 1977 can be fixed remotely from Earth." ― Brian Cox

        Amarnath S wrote (#3), in reply to honey the codewitch (#1):

        I have used some regexes in my article Translitera - Phonetic Typing in Some Indian Languages, which presents a tool for transliterating from English to some Indian languages. The program uses regexes to identify patterns in each word, and hence split each word into manageable parts. Of course, some situations are not handled; there is always scope for improvement.

          Peter_in_2780 wrote (#4), in reply to honey the codewitch (#1):

          I use them for input validation. Things like dates and times are straightforward. Names are not, even after imposing cultural restrictions (two capitalised names and some fussing around the edges). Patrick O'Reilly-Smythe and Ian McDonald are about as complex as I allowed for members of our Rural Fire Brigade. I can't remember whether Giulio d'Angelo would pass or fail. If he joins up, I'll revisit the code. ;P

          My other major use is in (often throwaway) sed scripts, or of course grep: things like extracting the word after "Invalid user" in security logs. Another one I used recently was to reconstruct words that were hyphenated across lines in an OCR'd manual. Google Translate barfs on the fragments of hyphenated words.

          Cheers, Peter
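The two throwaway jobs mentioned above can be sketched in Python instead of sed. This is an illustration only: the log wording ("Invalid user <name> from <address>"), the sample inputs, and the names `invalid_users` and `rejoin_hyphenated` are assumptions, not taken from the post.

```python
import re

# Assumed log wording: "... Invalid user <name> from <address>".
INVALID_USER = re.compile(r"Invalid user (\S+)")

# A word fragment, a hyphen at the end of a line, then the rest of the word.
BROKEN_WORD = re.compile(r"(\w+)-\n(\w+)")


def invalid_users(log_text: str) -> list[str]:
    """Return the word following each 'Invalid user' in the log text."""
    return INVALID_USER.findall(log_text)


def rejoin_hyphenated(text: str) -> str:
    """Glue words split across a line break back together.

    Caveat: a genuinely hyphenated word that happens to break at its
    hyphen (for example "well-" / "known") loses the hyphen as well.
    """
    return BROKEN_WORD.sub(r"\1\2", text)
```

For instance, `rejoin_hyphenated("the trans-\nmission case")` gives `"the transmission case"`, which a translation tool can then handle as a whole word.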

          Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012

            honey the codewitch wrote (#5), in reply to Kornfeld Eliyahu Peter (#2):

            If you do it right, regex facilitates rather than hinders code generation, but Microsoft's engine is unfortunately limited in that regard. It does code generation, but it doesn't generate C# code, for example. It could. It just doesn't. Like I said in the OP, a regex is a state machine is a regex. A state machine is code. It's code all the way down. :)

            ETA: If you stick to the basic operations and common syntactic sugar and avoid backtracking and other nonsense, most of the regex stuff is the same regardless of implementation.
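As a rough illustration of "a state machine is code", the sketch below turns a DFA transition table into Python source text and then loads it. It is not what Microsoft's engine or the author's tools emit; the generator `generate_matcher`, the table for `ab*c`, and the function name `match_ab_star_c` are hypothetical, and the generator assumes the table has at least one transition.

```python
def generate_matcher(name, transitions, start, accepting):
    """Emit Python source for a function implementing the given DFA."""
    lines = [f"def {name}(text):", f"    state = {start}", "    for ch in text:"]
    # Only states with outgoing edges need a branch of their own.
    states = sorted({s for s, _ in transitions})
    for i, s in enumerate(states):
        keyword = "if" if i == 0 else "elif"
        lines.append(f"        {keyword} state == {s}:")
        edges = [(c, t) for (src, c), t in sorted(transitions.items()) if src == s]
        for j, (c, t) in enumerate(edges):
            kw = "if" if j == 0 else "elif"
            lines.append(f"            {kw} ch == {c!r}:")
            lines.append(f"                state = {t}")
        lines.append("            else:")
        lines.append("                return False")
    # Any other state has no outgoing edges, so further input rejects.
    lines.append("        else:")
    lines.append("            return False")
    lines.append(f"    return state in {sorted(accepting)!r}")
    return "\n".join(lines) + "\n"


# Illustrative DFA for the regex ab*c: 0 -a-> 1, 1 -b-> 1, 1 -c-> 2.
TABLE = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}

source = generate_matcher("match_ab_star_c", TABLE, 0, {2})
namespace = {}
exec(source, namespace)  # load the generated function
match_ab_star_c = namespace["match_ab_star_c"]
```

Printing `source` shows a plain nested if/elif function with no table lookups at run time, which is the general shape a regex-to-code generator would produce for any target language.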

            Real programmers use butterflies

              RickZeeland wrote (#6), in reply to honey the codewitch (#1):

              As I don't use RegEx very often, I mostly use some online tool for help. Here you can find some: best-regex-testing-tools. I also like this CodeProject article: I don't like Regex... :-\

                Mike Hankey wrote (#7), in reply to honey the codewitch (#1):

                Regex is black magic, and sacrifices of caffeine and pizza offered in copious amounts are the only way to appease the beast.

                I'm not sure how many cookies it makes to be happy, but so far it's not 27. JaxCoder.com

                  Lost User wrote (#8), in reply to honey the codewitch (#1):

                  I would rather use whatever language I am working with to perform the parse. As you just stated... regex is technically another small programming language.

                  I am not sure if you know this... but you can take a regular expression and use the Ragel state machine compiler to convert it to C/C++, D, Go, Java, Ruby and even Objective-C. Interestingly... I do not see C# support. See also the Ragel Cheat Sheet.

                  Someone should probably write a little Visual Studio add-on that takes a regular expression and converts it to a C# state machine... as it seems .NET programmers use a lot of regex.

                  Best Wishes, -David Delaune

                    Slacker007 wrote (#9), in reply to honey the codewitch (#1):

                    regex is a tool. regex is great when used correctly. read the owner's manual, as with any tool. no one should be afraid of using a tool to complete a specific task.

                      CPallini wrote (#10), in reply to honey the codewitch (#1):

                      I am afraid of you, messing up with regex. :-D

                      Quote:

                      Or you can use them to generate code for state machines

                      That's indeed intriguing.

                      "In testa che avete, Signor di Ceprano?" -- Rigoletto

                        CPallini wrote (#11), in reply to Lost User (#8):

                        Quote:

                        I am not sure if you know this...

                        I didn't. Thank you for posting it.

                        "In testa che avete, Signor di Ceprano?" -- Rigoletto

                          honey the codewitch wrote (#12), in reply to CPallini (#10):

                          I wrote an article (which people hated) where I tried to impress that upon the reader: Fun With State Machines: Incrementally Parsing Numbers Using Hacked Regex

                          It centered around this expression for a JSON number:

                          (\-?)(0|[1-9][0-9]*)((\.[0-9]+)?([Ee][\+\-]?[0-9]+)?)

                          Which generates this state graph: State Graph

                          Which I then use to generate a state machine.
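For readers without access to the linked graph, here is a hand-derived sketch in Python of a DFA for the JSON number expression quoted above. The state numbers, the character-class names and the helpers `char_class` and `is_json_number` are assumptions made for this example; they are not copied from the article's state graph or generated code.

```python
import re

# The expression from the post, used here only as a reference to compare against.
JSON_NUMBER = re.compile(r"(\-?)(0|[1-9][0-9]*)((\.[0-9]+)?([Ee][\+\-]?[0-9]+)?)")


def char_class(ch: str) -> str:
    """Collapse a character into the input class the DFA switches on."""
    if ch == "0":
        return "zero"
    if ch in "123456789":
        return "nonzero"
    if ch == "-":
        return "minus"
    if ch == "+":
        return "plus"
    if ch == ".":
        return "dot"
    if ch in "eE":
        return "exp"
    return "other"


# 0 start, 1 after '-', 2 lone zero, 3 integer digits, 4 after '.',
# 5 fraction digits, 6 after e/E, 7 after exponent sign, 8 exponent digits.
TABLE = {
    0: {"minus": 1, "zero": 2, "nonzero": 3},
    1: {"zero": 2, "nonzero": 3},
    2: {"dot": 4, "exp": 6},
    3: {"zero": 3, "nonzero": 3, "dot": 4, "exp": 6},
    4: {"zero": 5, "nonzero": 5},
    5: {"zero": 5, "nonzero": 5, "exp": 6},
    6: {"plus": 7, "minus": 7, "zero": 8, "nonzero": 8},
    7: {"zero": 8, "nonzero": 8},
    8: {"zero": 8, "nonzero": 8},
}
ACCEPTING = {2, 3, 5, 8}


def is_json_number(text: str) -> bool:
    """Feed the text through the table one character at a time."""
    state = 0
    for ch in text:
        state = TABLE[state].get(char_class(ch))
        if state is None:
            return False
    return state in ACCEPTING
```

Because the matcher keeps only a single integer of state between characters, it can be paused and resumed as input arrives, which fits the incremental parsing the article title describes.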

                          Real programmers use butterflies

                            honey the codewitch wrote (#13), in reply to Lost User (#8):

                            I could do that, since I already wrote several apps that do exactly that. The issue is that it doesn't support backtracking, because I don't like backtracking regex.

                            Real programmers use butterflies

                              honey the codewitch wrote (#14), in reply to Mike Hankey (#7):

                              But it's such useful, compelling black magic! ;P

                              Real programmers use butterflies

                                Lost User wrote (#15), in reply to honey the codewitch (#1):

                                Reminds me of APL. The language of 80's modelling gods. Cryptic enough they had their own department. At least in assembler we had a few letters for op codes. Afraid? It's just too forgettable (for me).

                                It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food

                                  Chris Maunder wrote (#16), in reply to honey the codewitch (#1):

                                  By law we need to quote this XKCD: xkcd: Regular Expressions. I used to have this T-shirt, too: Regular Expressions. Personally, regular expressions are my indulgent cheat. Kinda like having pizza. I know I should go easy on them, and I'm trying to give them up, but when they are good, they are sooo good.

                                  cheers Chris Maunder

                                    honey the codewitch wrote (#17), in reply to Lost User (#15):

                                    I built a regex DOM for people like you. :suss: :laugh:

                                    Real programmers use butterflies

                                      Mike Hankey wrote (#18), in reply to honey the codewitch (#14):

                                      That it is. I've used Expresso Regular Expression Tool for years. It helps, but I still can't wrap my head around it. That and dark matter... :)

                                      I'm not sure how many cookies it makes to be happy, but so far it's not 27. JaxCoder.com

                                        honey the codewitch wrote (#19), in reply to Chris Maunder (#16):

                                        I would wear that shirt but then people would ask me what it meant and if I told them they would ask me to fix their computers. Apparently I have no impulse control because I even use regular expressions for things they were never intended for. :-\

                                        Real programmers use butterflies

                                          honey the codewitch wrote (#20), in reply to Mike Hankey (#18):

                                          Forget backtracking regular expressions, as they don't have the same fancy mathematical properties as their non-backtracking counterparts. Use the non-backtracking operators and there are only 5 operations to remember: concatenation, alternation, parentheses, zero-or-one match (?) and Kleene star (looping *, zero or more matches). Concatenation is implicit. They are

                                          1. Simpler to understand
                                          2. Faster to execute
                                          3. Weirdly mathy but in a cool way
                                          4. The same across almost all regular expression engines

                                          I give a primer at the end of this article. I taught them to my computer, and trust me - it's not very smart, but then I also taught it C in that article. Fun With State Machines: Incrementally Parsing Numbers Using Hacked Regex
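The five operations are few enough that a complete non-backtracking matcher fits in a short sketch. The version below uses Brzozowski derivatives, a technique swapped in here for brevity; it is not the method from the linked article. Patterns are built with hypothetical helper functions (`lit`, `cat`, `alt`, `opt`, `star`) instead of being parsed from text, and parentheses correspond to nesting the calls.

```python
EMPTY = ("empty",)  # matches nothing at all
EPS = ("eps",)      # matches only the empty string


def lit(c):
    return ("chr", c)


def cat(r, s):  # concatenation (implicit in regex syntax)
    if r == EMPTY or s == EMPTY:
        return EMPTY
    if r == EPS:
        return s
    if s == EPS:
        return r
    return ("cat", r, s)


def alt(r, s):  # alternation: r|s
    if r == EMPTY:
        return s
    if s == EMPTY:
        return r
    return r if r == s else ("alt", r, s)


def star(r):  # Kleene star: r*
    if r in (EMPTY, EPS):
        return EPS
    return r if r[0] == "star" else ("star", r)


def opt(r):  # zero or one: r?
    return alt(r, EPS)


def nullable(r):
    """True if r matches the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])  # alt


def deriv(r, c):
    """The regex matching what may follow c in strings matched by r."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return alt(deriv(r[1], c), deriv(r[2], c))
    if tag == "star":
        return cat(deriv(r[1], c), r)
    first = cat(deriv(r[1], c), r[2])  # tag == "cat"
    return alt(first, deriv(r[2], c)) if nullable(r[1]) else first


def matches(r, text):
    """One derivative per input character; no backtracking."""
    for c in text:
        r = deriv(r, c)
    return nullable(r)


# ab?(c|d)* written with the helpers above.
PATTERN = cat(lit("a"), cat(opt(lit("b")), star(alt(lit("c"), lit("d")))))
```

Each distinct derivative reached while matching plays the role of a DFA state, so enumerating them for every input character is one way to build the state machine the thread keeps referring to.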

                                          Real programmers use butterflies
