I love regular expressions

  • H honey the codewitch

    A modest proposal: Learn the DFA subset. Commit it to memory, and forget the rest. DFA is the non-backtracking subset of regular expressions:

      () - capture and group
      [] - match char ranges
      * - match zero or more
      + - match one or more
      ? - match zero or one
      . - match any single character
      | - match a or b (a|b)
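As a rough illustration of how far that subset goes, here is a minimal Python sketch. Note the assumptions: Python's `re` module is a backtracking engine, so this only shows the syntax and is not a DFA implementation, and the pattern and the `is_number` helper are made up for this example.

```python
import re

# Built only from operators in the list above: () [] + ? |
# [.] matches a literal dot without needing a backslash escape.
NUMBER = re.compile(r"([+-]?[0-9]+([.][0-9]+)?)|(nan)")


def is_number(text: str) -> bool:
    """Return True if the whole string is a signed decimal number or 'nan'."""
    return NUMBER.fullmatch(text) is not None


print(is_number("-3.14"))  # True
print(is_number("3."))     # False
```

Because nothing here needs lookaround or backreferences, the same pattern should carry over to most engines with little or no change, though escaping rules for the host tool still apply.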

    Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix

    jmaida wrote (#63):

    Agree. A good and practical recommendation!

    "A little time, a little trouble, your better day" Badfinger

    • G GuyThiebaut

      I used ChatGPT precisely for that and it returned a decent regex with an explanation. I needed to word my question in a manner that was generic but the result was actually helpful.

      “That which can be asserted without evidence, can be dismissed without evidence.”

      ― Christopher Hitchens

      jmaida wrote (#64):

      cool. forgot ChatGPT is hanging out there

      • K k5054

        Not keeping your tool to yourself is one of the leading causes of dismissal, too.

        "A little song, a little dance, a little seltzer down your pants" Chuckles the clown

        jmaida wrote (#65):

        lordy... "it's a bad worker who blames their tools." "there is no tool like an old tool." :)

        • D David ONeil

          How the hell would someone know that "[0-9]{1,3}" is enough to find sequential numbers in Microsoft Word? How the hell did I find that magic? Every regular expression seems to require a convoluted google search. Crazy world...

          Our Forgotten Astronomy | Object Oriented Programming with C++ | Wordle solver

          jmaida wrote (#66):

          yikes, I forgot how complex regex can get. It gives one a headache trying to parse them, much less compose one. Very worthy topic though.

            k5054 wrote (#67), replying to honey the codewitch:

            Would not "?" (match zero or one) be part of that? Or is that a POSIX extension?

              GuyThiebaut wrote (#68), replying to jmaida (#64):

              I had exactly the dilemma you had: I was looking for a reverse regular expression parser, i.e. something that creates a regular expression from a desired result and an initial string, and I thought I would give ChatGPT a go.

                honey the codewitch wrote (#69), replying to k5054 (#67):

                You're right. Totally spaced that. I edited.

                • H honey the codewitch

                  jschell wrote:

                  Not sure what you mean by that.

                  What I mean is that regardless of the platform you choose, there is a way to run a DFA regular expression on it. And yeah, that encompasses many different engines, which themselves are what run on a particular platform, unless you're doing code generation, which I do sometimes for them so I don't have to include the regex engine in my firmware. That code is easy to make cross platform. You'd almost have to put in extra effort to make it otherwise. :) I was maybe trying to be too brief by half. I assumed the meaning would come through, but I guess not.

                  jschell wrote:

                  Perhaps you were referring to that the simplest syntax works in different engines though

                  In part yes, but also, virtually every platform has a DFA regex engine for it, or alternatively you can generate DFA code for that platform, with something such as my rxcg project. I was intending to imply that as well.
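The idea of generating DFA code instead of shipping a regex engine can be sketched as a plain transition table. This is a hand-written Python illustration for the expression [0-9]+([.][0-9]+)?, not output from rxcg; the state numbering and the names `TRANSITIONS`, `char_class`, and `dfa_match` are assumptions made for the example.

```python
# States: 0 = start, 1 = integer digits (accepting),
#         2 = just saw '.', 3 = fraction digits (accepting)
ACCEPTING = {1, 3}

# (current state, input class) -> next state; missing entries mean "reject".
TRANSITIONS = {
    (0, "digit"): 1,
    (1, "digit"): 1,
    (1, "dot"): 2,
    (2, "digit"): 3,
    (3, "digit"): 3,
}


def char_class(ch: str) -> str:
    """Map one input character to the input class used by the table."""
    if "0" <= ch <= "9":
        return "digit"
    if ch == ".":
        return "dot"
    return "other"


def dfa_match(text: str) -> bool:
    """Run the table over the whole input: one lookup per character."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING


print(dfa_match("3.14"))  # True
print(dfa_match("3."))    # False
```

The matcher is a single loop with a table lookup and no backtracking, which is presumably why the same shape is easy to emit as C for a small microcontroller.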

                  jschell wrote (#70):

                  honey the codewitch wrote:

                  What I mean is that regardless of the platform you choose, there is a way to run a DFA regular expression on it.

                  If you find a programming system that doesn't allow that then you might prepare for the universe to end. Far as I can recall it would be mathematically impossible for that not to be true.

                  • H honey the codewitch

                    I recall this too. QWERTY, I heard, was in part laid out to slow typists down so the mechanical typewriter could keep up. I heard it from the Beagle Bros back in the 1980s so I don't know how true it is. Either way, presumably eventually that wasn't an issue anymore. And Dvorak was a common alternative, or at least common as QWERTY alternatives go. It was touted as better, but despite the hype I remember reading that it didn't actually improve people's WPM.

                    jschell wrote (#71):

                    honey the codewitch wrote:

                    out to slow typists down so the mechanical typewriter could keep up

                    Not exactly. Rather, the layout exists to allow typing to be faster. From the Wikipedia article on QWERTY: "but rather to speed up typing. Indeed, there is evidence that, aside from the issue of jamming, placing often-used keys farther apart increases typing speed, because it encourages alternation between the hands."

                    honey the codewitch wrote:

                    but despite the hype

                    I believe that was the one where the data was faked.

                      honey the codewitch wrote (#72), replying to jschell (#70):

                      jschell wrote:

                      If you find a programming system that doesn't allow that then you might prepare for the universe to end

                      Maybe I misunderstand you, but if you're speaking in the general sense, you aren't going to run a garbage collected system for example, on an 8-bit platform with 4KB of RAM, hopefully. Even if you could, it wouldn't be practical for anything. A DFA on the other hand will run handily there.

                        jschell wrote (#73), replying to honey the codewitch (#72):

                        honey the codewitch wrote:

                        A DFA on the other hand will run handily there.

                        Reverse that though. What system, which has resources to run anything that is non-trivial, will not run a DFA?

                        • B BernardIE5317

                          i enjoy the challenge of learning / writing regular expressions for my text editing of source code . as in all things mastering it is the same as being invited to perform at Carnegie Hall id est "practice practice practice" . my purpose in this post though is to express my impression from these many and expert posts utilizing fancy schmancy terms of which i do not know which seem to indicate regular expressions are more powerful / advanced / sophisticated than i know as i understand them to be nothing more than a text editing convenience . may i please inquire am i wrong in this regard . thank you kindly .

                          jschell wrote (#74):

                          If you like to read technical stuff then try "Mastering Regular Expressions" by Friedl (yes, spelled like that). It is not only interesting but a bit scary, since it provides examples that will shut down your system.

                            honey the codewitch wrote (#75), replying to jschell (#73):

                            I can't think of anything that couldn't run a DFA. :confused: It's such a basic Turing-esque construction.

                            • K k5054

                              Richard Attenborough: Actor, Jurassic Park, The Great Escape ...
                              David Attenborough: Broadcaster and Biologist
                              They are brothers, though. But maybe you meant Richard?

                              MarkTJohnson wrote (#76):

                              Yeah, I meant the one who narrates nature programs.

                              I’ve given up trying to be calm. However, I am open to feeling slightly less agitated. I’m begging you for the benefit of everyone, don’t be STUPID.

                              • J jmaida

                                need regex to natural language and vice versa

                                TNCaver wrote (#77):

                                This alleged AI-powered generator got me close to what I needed today: https://www.regexgo.com/
                                And this site was a great help in troubleshooting and refining the AI's results: https://regex101.com/
                                Bonus: I still don't understand RegEx.

                                There are no solutions, only trade-offs.
                                   - Thomas Sowell

                                A day can really slip by when you're deliberately avoiding what you're supposed to do.
                                   - Calvin (Bill Watterson, Calvin & Hobbes)

                                • S seismofish

                                  I'm absolutely with you on that. Also, with PCREs, the /x switch allows you to indent and comment to your heart's content, so you can write perfectly legible code, and there are on-line engines where you can drop your expression and your input and watch step by step while it does its magic. I'm not sure that I do a day's work without writing a regex and I know of no tool with anywhere near the power for parsing text. ~~~~~~~~ <°}}}>«<
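Python's `re.VERBOSE` flag plays the same role as the /x switch mentioned above. A minimal sketch, assuming a made-up date pattern and a hypothetical `parse_date` helper, showing how whitespace and comments can live inside the expression:

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and # starts a comment.
DATE = re.compile(
    r"""
    ^
    (?P<year>[0-9]{4})   # four-digit year
    -
    (?P<month>[0-9]{2})  # two-digit month
    -
    (?P<day>[0-9]{2})    # two-digit day
    $
    """,
    re.VERBOSE,
)


def parse_date(text: str):
    """Return (year, month, day) as ints, or None if the text does not match."""
    m = DATE.match(text)
    if m is None:
        return None
    return int(m["year"]), int(m["month"]), int(m["day"])


print(parse_date("2024-01-31"))  # (2024, 1, 31)
print(parse_date("31/01/2024"))  # None
```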

                                  BernardIE5317 wrote (#78):

                                  what about AWK ?

                                  • L Lost User

                                    I blank out on regular expressions. Too much like learning a new language. The other day, I needed to get 0 or more leading characters from a string (as an int). I Googled (regex), I went, I left. Coding challenge: get leading digits (I settled for LINQ)

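                                    // requires: using System; using System.Linq;
                                    var text = "123rd NY 2nd Battalion";
                                    // Count the leading digits, then parse that prefix (0 if there are none).
                                    var count = text.TakeWhile( c => Char.IsDigit( c ) ).Count();
                                    int i = count > 0 ? Convert.ToInt32( text.Substring( 0, count ) ) : 0;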

                                    Answer: 123

                                    "Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I

                                    BernardIE5317 wrote (#79):

                                    i requested ChatGPT "please write a regular expression which identifies the numerical digits beginning a text ." its response below .

                                        ^\d+

                                    is it wrong ? was my request incorrect ?
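For what it is worth, that expression does appear to do the job. A minimal Python sketch, using the sample string from the LINQ post above; the `leading_int` helper is a name invented here, and the `re.ASCII` flag is an added assumption so that \d only matches 0-9 (in Python 3, \d on text otherwise also matches non-ASCII digits).

```python
import re


def leading_int(text: str) -> int:
    """Return the digits at the start of text as an int, or 0 if there are none."""
    m = re.search(r"^\d+", text, re.ASCII)
    return int(m.group()) if m else 0


print(leading_int("123rd NY 2nd Battalion"))  # 123
print(leading_int("NY 2nd Battalion"))        # 0
```

The one behavioral difference from the LINQ version is the "zero or more" case: `^\d+` simply fails to match when there are no leading digits, so the caller has to supply the 0.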

                                      BernardIE5317 wrote (#80), replying to jschell (#74):

                                      thank you for the suggestion . intriguing . too bad i am a cheapskate . as for advanced uses i now am reminded of its use in determining if a number is prime : "Demystifying The Regular Expression That Checks If A Number Is Prime" (The Codeumentary) .
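The trick is usually quoted as the pattern below, applied to the number written in unary (a string of n identical characters). A minimal Python sketch, assuming this is the pattern the linked article discusses; `is_prime` is a name made up here. It relies on a backreference (\1), so it falls outside the DFA subset discussed in this thread, and it is far too slow for real use.

```python
import re

# ^.?$         matches a string of length 0 or 1 (not prime)
# ^(..+?)\1+$  matches a block of length >= 2 repeated at least twice (composite)
NOT_PRIME = re.compile(r"^.?$|^(..+?)\1+$")


def is_prime(n: int) -> bool:
    """Test primality by trying to split n ones into equal blocks."""
    return NOT_PRIME.match("1" * n) is None


print([n for n in range(20) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```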

                                      • H honey the codewitch

                                        I don't understand why it's difficult - DFA at least. DFA is () [] * + . | That's not a whole lot to master.

                                        User 13269747 wrote (#81):

                                        Quote:

                                        I don't understand why it's difficult - DFA at least.

                                        I am going to answer your question as thoroughly as possible. Regexes are simple in theory, not in practice, which is why people have problems with them.

                                        =========================================

                                        In theory, that's all you need to know. In practice, that's just the start - you missed '^' (which has two meanings), '\' (used for escaping those special characters) and '$'.

                                        In $PROGRAM, which of the special characters need to be escaped? How about the replacement expression in "%s" (different to the match expression "/s")? How does it match newlines (hint: in Vim, for example, '$' vs '\n' vs '\r' all do different things)?

                                        Write your expression to work in Vim, and it fails in your JavaScript program. Write your expression in sed and it fails using the regex library in C#. The expression that works in the default invocation of grep fails in the default invocation of Perl. Use `grep -E` and the expression fails on some tools but not on others.

                                        Even passing a regex on to an engine is difficult: in an interactive bash shell you'd use sed "s/\\t//g". In a script that sets the results of that invocation to an environment variable you'd use sed "s/\\\\t//g". You run into similar problems within your programs when you pass around string variables containing regexes, and many of the programs which use match expressions in their configuration (like nginx) have quoting and escaping rules that differ from the command-line programs, even though they use the same regex library.

                                        When you use regex liberally in Python, Bash, Grep, Vim, C#, Perl, JavaScript and everything else, you never remember how they all handle the special cases - you have to keep looking them up for that particular program.

                                        I'm fairly comfortable with them, having spent the 90s as a Perl programmer, and having used Vim as my default coding editor daily for almost 30 years, during which time I collected a couple of postgraduate CS (not IT) degrees (which means I know automata theory better than most), and yet even I have to look regex stuff up on a per-product basis.

                                        I am skeptical that you can look at an expression and go "This will work in $x, $y and $z, but not in $a, $b and $c.", and if my skepticism is correct, then you have problems too, but just don't know it.

                                        And that is why people have problems with them - you never quite know which contexts allow '.' to
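The layering of escapes described in this post can be seen even inside a single language. A small Python sketch with made-up helper names (`strip_with`, `leading_non_digits`); it only demonstrates Python's `re` module and says nothing about how sed, Vim, or grep treat the same patterns.

```python
import re

TAB_TEXT = "a\tb"       # contains one real tab character
LITERAL_TEXT = "a\\tb"  # contains a backslash followed by the letter t


def strip_with(pattern: str, text: str) -> str:
    """Delete every match of pattern from text."""
    return re.sub(pattern, "", text)


# Three spellings that all remove the tab:
print(strip_with("\t", TAB_TEXT))    # Python already turned \t into a tab
print(strip_with("\\t", TAB_TEXT))   # the regex engine receives \t and reads it as tab
print(strip_with(r"\t", TAB_TEXT))   # raw string: the same two characters as "\\t"

# One more level of doubling now means "backslash, then t":
print(strip_with("\\\\t", TAB_TEXT) == TAB_TEXT)  # True, nothing matched
print(strip_with("\\\\t", LITERAL_TEXT))          # ab


def leading_non_digits(text: str) -> str:
    """The first ^ anchors at the start; the ^ inside [] negates the class."""
    return re.match(r"^[^0-9]*", text).group()


print(leading_non_digits("abc123"))  # abc
```

Each extra layer (shell quoting, a config file parser, a string literal in another language) can add its own round of doubling on top of this, which is one way to read the sed examples in the post.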

                                          jschell wrote (#82), replying to honey the codewitch (#75):

                                          That is what I meant when I responded to your original comment.
