Code Project
I love regular expressions

83 Posts 36 Posters 2 Views 1 Watching
  • H honey the codewitch

    At least the non-backtracking subset. DFA regular expressions.
    - they are a compact way to describe a simple syntax
    - they are plain text and brief, easily communicable and transferable
    - they are cross platform (at least DFA), running in most any engine
    - they are incredibly efficient (again, DFA)
    - they are versatile, able to do validation, tokenization, and matching as well

    That's probably why they will always be with us. They are maybe the perfect canonical execution of a Chomsky type 3 language. Sure, they can be really terse, but this is as much a strength as it is a weakness, because it facilitates some of the above. I know some people hate them, and I can understand that. But show me a better way.

    Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix

    Ron Anders wrote (#56):

    This is the most runaway thread on CP in a long while. Way to poke the bear.

      Lost User wrote (#57), replying to honey the codewitch:

      I blank out on regular expressions. Too much like learning a new language. The other day, I needed to get 0 or more leading digits from a string (as an int). I Googled (regex), I went, I left. Coding challenge: get leading digits (I settled for LINQ):

      var text = "123rd NY 2nd Battalion";
      // Count the leading digits, then convert that prefix (0 if there are none).
      var count = text.TakeWhile( c => Char.IsDigit( c ) ).Count();
      int i = count > 0 ? Convert.ToInt32( text.Substring( 0, count ) ) : 0;

      Answer: 123

      "Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I

      • H honey the codewitch

        It's because I slept. :) But yeah, I get a bit overwhelmed when I get a lot of responses, so I kind of respond as I'm able.

        Mark Starr wrote (#58):

        I didn’t mean to imply you didn’t participate in the conversation, rather that the topic usually starts up a lively discussion. :cool:

        Time is the differentiation of eternity devised by man to measure the passage of human events. - Manly P. Hall Mark Just another cog in the wheel

          honey the codewitch wrote (#59), replying to Lost User:

          ^[0-9]+
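For comparison with the LINQ version, here is a minimal sketch of how that pattern could be applied. It uses Python's `re` module rather than C#, and the helper name `leading_int` is made up for illustration. Because `+` needs at least one digit, the "zero digits" case is handled by falling back to 0.

```python
import re

# Anchored at the start of the string: one or more ASCII digits.
LEADING_DIGITS = re.compile(r"^[0-9]+")

def leading_int(text: str) -> int:
    """Return the leading run of digits as an int, or 0 if there is none."""
    m = LEADING_DIGITS.match(text)
    return int(m.group()) if m else 0

print(leading_int("123rd NY 2nd Battalion"))  # 123
print(leading_int("NY 2nd Battalion"))        # 0
```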

            BernardIE5317 wrote (#60), replying to honey the codewitch:

            i enjoy the challenge of learning / writing regular expressions for my text editing of source code . as in all things mastering it is the same as being invited to perform at Carnegie Hall id est "practice practice practice" . my purpose in this post though is to express my impression from these many and expert posts utilizing fancy schmancy terms of which i do not know which seem to indicate regular expressions are more powerful / advanced / sophisticated than i know as i understand them to be nothing more than a text editing convenience . may i please inquire am i wrong in this regard . thank you kindly .

              honey the codewitch wrote (#61), replying to BernardIE5317:

              They're for text processing, but for more than text editing. Compiler tokenizers, for example, are often built up from regular expressions; lexer generators such as lex work that way, even if some production compilers (C#'s included) hand-code the equivalent state machine. That said, you're basically not wrong. I mean, tokenization is almost like text matching, but with a twist.
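The "twist" is roughly that a tokenizer tries several patterns at each position, reports which one matched, and then moves on. A minimal sketch, in Python for brevity: the token names and the toy grammar are invented for illustration and are not taken from any real compiler. Python's `re` is a backtracking engine, but the patterns below stay within the DFA-friendly subset.

```python
import re

# Each token kind is one small regular expression; the alternation of named
# groups acts as a single combined matcher.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENT",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source: str):
    """Yield (kind, text) pairs; raise ValueError on an unrecognized character."""
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":   # whitespace is matched but not reported
            yield m.lastgroup, m.group()
        pos = m.end()

print(list(tokenize("x1 = 42 + y")))
```

Plain matching answers "does this text fit the pattern?"; the tokenizer also answers "which pattern, and where does the next token start?".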

              • G giulicard

                Joking aside, more trivially I believe the term "regular" refers to the third level of Chomsky's hierarchy, which is defined precisely as Type-3 (regular). DFAs (deterministic finite automata) are FSAs (finite-state automata).

                honey the codewitch wrote (#62):

                Can confirm.

                • H honey the codewitch

                  A modest proposal: Learn the DFA subset. Commit it to memory, and forget the rest. DFA is the non-backtracking subset of regular expressions:

                  () - capture and group
                  [] - match char ranges
                  * - match zero or more
                  + - match one or more
                  ? - match zero or one
                  . - match any single character
                  | - match a or b (a|b)
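To see the whole subset in one place, here is a small sketch that exercises each operator from the list. It is written against Python's `re` module (a backtracking engine, so it illustrates the syntax only, not DFA performance), and the sample patterns and strings are invented for the demonstration.

```python
import re

# (pattern, strings that should match in full, strings that should not)
EXAMPLES = [
    (r"(ab)c",   ["abc"],         ["ab", "c"]),  # ()  group
    (r"[a-c]x",  ["ax", "cx"],    ["dx"]),       # []  char range
    (r"ab*",     ["a", "abbb"],   ["b"]),        # *   zero or more
    (r"ab+",     ["ab", "abbb"],  ["a"]),        # +   one or more
    (r"ab?",     ["a", "ab"],     ["abb"]),      # ?   zero or one
    (r"a.c",     ["abc", "a-c"],  ["ac"]),       # .   any single character
    (r"cat|dog", ["cat", "dog"],  ["cow"]),      # |   a or b
]

def check(examples) -> bool:
    """Return True if every row behaves as its comment claims."""
    for pattern, good, bad in examples:
        rx = re.compile(pattern)
        if not all(rx.fullmatch(s) for s in good):
            return False
        if any(rx.fullmatch(s) for s in bad):
            return False
    return True

print(check(EXAMPLES))  # True
```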

                  jmaida wrote (#63):

                  Agree. A good and practical recommendation!

                  "A little time, a little trouble, your better day" Badfinger

                  • G GuyThiebaut

                    I used ChatGPT precisely for that and it returned a decent regex with an explanation. I needed to word my question in a manner that was generic but the result was actually helpful.

                    “That which can be asserted without evidence, can be dismissed without evidence.”

                    ― Christopher Hitchens

                    jmaida wrote (#64):

                    cool. forgot chatgpt is hanging out there

                    • K k5054

                      Not keeping your tool to yourself is one of the leading causes of dismissal, too.

                      "A little song, a little dance, a little seltzer down your pants" Chuckles the clown

                      jmaida wrote (#65):

                      lordy... "it's a bad worker who blames their tools." "there is no tool like an old tool." :)

                      • D David ONeil

                        How the hell would someone know that "[0-9]{1,3}" is enough to find sequential numbers in Microsoft Word? How the hell did I find that magic? Every regular expression seems to require a convoluted google search. Crazy world...

                        Our Forgotten Astronomy | Object Oriented Programming with C++ | Wordle solver
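For what it's worth, the `{1,3}` in that pattern is bounded repetition: between one and three of the preceding item. It is shorthand that can be spelled out with `?`, as this sketch suggests. Python's `re` syntax is assumed here; Word's wildcard search has its own dialect, so behaviour there may differ.

```python
import re

short = re.compile(r"[0-9]{1,3}")            # one to three digits
longhand = re.compile(r"[0-9][0-9]?[0-9]?")  # the same thing, spelled out

for sample in ["7", "42", "123", "1234", "x"]:
    a = short.fullmatch(sample) is not None
    b = longhand.fullmatch(sample) is not None
    print(f"{sample!r}: short={a} longhand={b}")
```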

                        jmaida wrote (#66):

                        yikes, i forgot how complex regex can get. gives one a headache trying to parse them, much less compose one. very worthy topic though.

                          k5054 wrote (#67), replying to honey the codewitch:

                          would not ? (match zero or one) be part of that? or is that a posix extension?

                            GuyThiebaut wrote (#68), replying to jmaida:

                            I had exactly the dilemma you had - I was looking for a reverse regular expression parser, i.e. one that creates a regular expression from a desired result and an initial string, and I thought I would give ChatGPT a go.

                              honey the codewitch wrote (#69), replying to k5054:

                              You're right. Totally spaced that. I edited.

                              • H honey the codewitch

                                jschell wrote:

                                Not sure what you mean by that.

                                What I mean is that regardless of the platform you choose, there is a way to run a DFA regular expression on it. And yeah, that encompasses many different engines, which themselves are what run on a particular platform, unless you're doing code generation, which I do sometimes for them so I don't have to include the regex engine in my firmware. That code is easy to make cross platform. You'd almost have to put in extra effort to make it otherwise. :) I was maybe trying to be too brief by half. I assumed the meaning would come through, but I guess not.

                                jschell wrote:

                                Perhaps you were referring to that the simplest syntax works in different engines though

                                In part yes, but also, virtually every platform has a DFA regex engine for it, or alternatively you can generate DFA code for that platform, with something such as my rxcg project. I was intending to imply that as well.
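To make the "generate DFA code" idea concrete, here is a rough sketch of a table-driven matcher for `[0-9]+(\.[0-9]+)?` written by hand. This is not the output of rxcg or of any particular generator; the state numbering and all names are invented for illustration. The point is only that a DFA reduces to a table and a loop, with no regex engine needed at run time.

```python
# Hand-built DFA for [0-9]+(\.[0-9]+)?
# States: 0 = start, 1 = in integer part (accepting),
#         2 = just saw '.', 3 = in fraction (accepting)
DIGIT, DOT, OTHER = 0, 1, 2
REJECT = -1

TRANSITIONS = [
    # DIGIT  DOT     OTHER
    [1,      REJECT, REJECT],  # state 0
    [1,      2,      REJECT],  # state 1
    [3,      REJECT, REJECT],  # state 2
    [3,      REJECT, REJECT],  # state 3
]
ACCEPTING = {1, 3}

def classify(ch: str) -> int:
    """Map a character to its input class (a column in the table)."""
    if "0" <= ch <= "9":
        return DIGIT
    if ch == ".":
        return DOT
    return OTHER

def matches(text: str) -> bool:
    """Run the DFA over the whole string: one table lookup per character."""
    state = 0
    for ch in text:
        state = TRANSITIONS[state][classify(ch)]
        if state == REJECT:
            return False
    return state in ACCEPTING

print(matches("3.14"), matches("42"), matches("3."), matches(".5"))
```

Nothing here depends on the host language beyond arrays and a loop, which is why the same table ports easily between platforms.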

                                jschell wrote (#70):

                                honey the codewitch wrote:

                                What I mean is that regardless of the platform you choose, there is a way to run a DFA regular expression on it.

                                If you find a programming system that doesn't allow that then you might prepare for the universe to end. Far as I can recall it would be mathematically impossible for that not to be true.

                                • H honey the codewitch

                                  I recall this too. QWERTY, I heard, was in part laid out to slow typists down so the mechanical typewriter could keep up. I heard it from the Beagle Bros back in the 1980s so I don't know how true it is. Either way, presumably eventually that wasn't an issue anymore. And Dvorak was a common alternative, or at least common as QWERTY alternatives go. It was touted as better, but despite the hype I remember reading that it didn't actually improve people's WPM.

                                  jschell wrote (#71):

                                  honey the codewitch wrote:

                                  out to slow typists down so the mechanical typewriter could keep up

                                  Not exactly. Rather, the layout exists to allow typing to be faster. From the Wikipedia article on QWERTY: "... but rather to speed up typing. Indeed, there is evidence that, aside from the issue of jamming, placing often-used keys farther apart increases typing speed, because it encourages alternation between the hands."

                                  honey the codewitch wrote:

                                  but despite the hype

                                  I believe that was the one where the data was faked.

                                    honey the codewitch wrote (#72), replying to jschell:

                                    jschell wrote:

                                    If you find a programming system that doesn't allow that then you might prepare for the universe to end

                                    Maybe I misunderstand you, but if you're speaking in the general sense, you aren't going to run a garbage collected system for example, on an 8-bit platform with 4KB of RAM, hopefully. Even if you could, it wouldn't be practical for anything. A DFA on the other hand will run handily there.
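One way to see why a DFA suits a tiny target: the transition table is constant data that could live in ROM or flash, and the only mutable state is a single small integer. A sketch of that shape, in Python for readability (on real firmware this would more likely be C). The table below, for the pattern `ab*c`, and all names are made up for illustration.

```python
# DFA for ab*c packed into a flat, read-only byte string.
# Rows are states; columns are input classes:
#   0 = 'a', 1 = 'b', 2 = 'c', 3 = anything else.
# 0xFF marks "no transition".
NUM_CLASSES = 4
DEAD = 0xFF
TABLE = bytes([
    1,    DEAD, DEAD, DEAD,   # state 0: start, expects 'a'
    DEAD, 1,    2,    DEAD,   # state 1: after 'a'; loops on 'b', 'c' finishes
    DEAD, DEAD, DEAD, DEAD,   # state 2: accepting, no further input allowed
])
ACCEPT_STATE = 2

def char_class(byte: int) -> int:
    """Map an input byte to a column index."""
    if byte == ord("a"):
        return 0
    if byte == ord("b"):
        return 1
    if byte == ord("c"):
        return 2
    return 3

def step(state: int, byte: int) -> int:
    """Advance by one input byte; the caller keeps only the returned state."""
    if state == DEAD:
        return DEAD
    return TABLE[state * NUM_CLASSES + char_class(byte)]

def accepts(data: bytes) -> bool:
    state = 0
    for byte in data:  # could equally be bytes arriving one at a time
        state = step(state, byte)
    return state == ACCEPT_STATE

print(len(TABLE), accepts(b"abbbc"), accepts(b"ac"), accepts(b"abca"))
```

The whole matcher is a 12-byte table plus one state variable, with no allocation while it runs.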

                                      jschell wrote (#73), replying to honey the codewitch:

                                      honey the codewitch wrote:

                                      A DFA on the other hand will run handily there.

                                      Reverse that though. What system, which has resources to run anything that is non-trivial, will not run a DFA?

                                        jschell wrote (#74), replying to BernardIE5317:

                                        If you like to read technical stuff, then try "Mastering Regular Expressions" by Friedl (yes, spelled like that). Not only interesting but a bit scary, since it provides examples that will shut down your system.

                                          honey the codewitch wrote (#75), replying to jschell:

                                          I can't think of anything that couldn't run a DFA. :confused: It's such a basic Turing-esque construction.
