Code Project

Which type of Regex best to learn for programming with C?

    HS_C_Student wrote (#1):

    I like C, but I feel its Achilles' heel is string processing. Lately I've been parsing a lot of undocumented text databases in arbitrary formats, and I need to adapt. What I need is input validation: define patterns for the expected format of the data, and store the values only if the whole string matches a known pattern. I'd rather not run the rest of my code without verifying that the input conforms. I think regular expressions are the best way to augment my existing skills without learning a new language, but regexes seem rather varied and mixed-breed. Perl (5?) seems to have a formally standardized regex syntax that is supported by many search and text-editing programs. There's also PCRE, which I can compile on Windows or download as a precompiled lib/DLL. Should I learn Perl regexes and use PCRE, or am I overlooking something?

      Lost User wrote (#2):

      There are many websites that help with learning regexes; the Expresso Regular Expression Tool is a popular one. You will also need a support library, as C does not have native support for regular expressions.

        HS_C_Student wrote (#3):

        As far as I can tell, there are at least three main types: POSIX basic, POSIX extended, and Perl-compatible. There's a list of engines here: Comparison of regular expression engines - Wikipedia. And apparently there are some differences between Perl and PCRE: Perl Compatible Regular Expressions - Wikipedia. I don't know whether, say, 80 or 97 percent of the regex syntax is shared between one flavor and another, or whether they are distinct subtypes with significant differences. I also don't know whether they all support ASCII, Unicode, and UTF encodings, or whether they can all return the matched values or some only give a match/no-match result.
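To make the "match/no match versus matched values" question concrete, here is a minimal sketch using the POSIX extended flavor through `<regex.h>`. It assumes a POSIX system (that header is not part of standard C and is generally not shipped with MSVC on Windows), and the "key=digits" record format, the pattern, and the function name `parse_record` are all invented for illustration. POSIX does not define the Perl shorthand `\d`, so the digit class is spelled out as `[0-9]`. The `^` and `$` anchors are what force the whole string to match, and the `regmatch_t` array is how the captured groups come back.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record format, invented for this sketch: "key=digits". */
#define RECORD_PATTERN "^([A-Za-z_]+)=([0-9]+)$"

/*
 * Validate `line` against RECORD_PATTERN and, only on a whole-string match,
 * copy the two captured groups into `key` and `value`.
 * Returns 1 on a match, 0 on no match, and -1 if the pattern fails to
 * compile or an output buffer is too small.
 */
int parse_record(const char *line, char *key, size_t key_size,
                 char *value, size_t value_size)
{
    regex_t re;
    regmatch_t m[3];   /* m[0] = whole match, m[1] and m[2] = the groups */
    int result = 0;

    assert(line != NULL && key != NULL && value != NULL);

    if (regcomp(&re, RECORD_PATTERN, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, 3, m, 0) == 0) {
        size_t key_len = (size_t)(m[1].rm_eo - m[1].rm_so);
        size_t value_len = (size_t)(m[2].rm_eo - m[2].rm_so);

        if (key_len < key_size && value_len < value_size) {
            /* Offsets in m[] index into `line`; copy and terminate. */
            memcpy(key, line + m[1].rm_so, key_len);
            key[key_len] = '\0';
            memcpy(value, line + m[2].rm_so, value_len);
            value[value_len] = '\0';
            result = 1;
        } else {
            result = -1;
        }
    }

    regfree(&re);
    return result;
}
```

Called as `parse_record("width=640", key, sizeof key, value, sizeof value)`, the function stores the two values only when the entire line conforms; a line such as `width=640px` is rejected without touching the output buffers. A PCRE-based version would follow the same compile, execute, extract shape, but with that library's own API.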

          Lost User wrote (#4):

          You will have to read the documentation to find out which one suits your requirements best.

            John R Shaw wrote (#5):

            PCRE is a good option. It is based on Perl's regex, although there are some minor differences under the hood (which I do not remember now). I wrote my own C++ template regex some years ago that gives me full control of behavior in my personal projects. I used other libraries, such as PCRE, for comparison in my test bed, to test for speed and accuracy. That is how I know there are some minor differences in what each one considers valid and invalid syntax (implementation differences or programmer's mindset, who knows?).

            As for C's ability to process strings or any other data type: it is very efficient. I used to be able to look at C code and translate it, in my head, directly to the equivalent assembly code. What you are talking about is the standard C library, which was designed to provide only the simple low-level functionality that programmers need to develop more complex algorithms (how many ways are there to write a 'strcmp' function?). It was left to others to provide libraries whose functions need more than a simple 'for' or 'while' loop.

            That being said, when I find myself doing contract work on old C code, where I am not allowed to upgrade or use external libraries, I recreate some simple algorithms for parsing (hey, it's their money, so who am I to argue with a brick wall). Basically, I create equivalent functions for parsing sub-strings like the regex "\d*" or "[abd]" and wrap them in a function call, depending on what I am looking for. What little testing I have done has shown me that they were more efficient than the MS implementation of regex (not a surprise).

            Conclusion: C is the most efficient language I have ever worked with. There is a reason that all of the modern operating systems I have worked with were written in C. (I have not checked lately, so it is possible that C++ snuck in there somewhere.)

            INTP "Program testing can be used to show the presence of bugs, but never to show their absence." - Edsger Dijkstra "I have never been lost, but I will admit to being confused for several weeks. " - Daniel Boone
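The hand-written approach described in post #5, small functions standing in for fragments like "\d*" or "[abd]", might look roughly like the sketch below. This is not the poster's code: the function names and the composed pattern `[abd]\d*` are hypothetical, chosen only to show how such pieces can be chained into a whole-string check using nothing beyond the standard C library. Each matcher returns how many characters it consumed at the given position.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for "\d*" at position s: length of the leading run of
 * ASCII digits, which may be zero. */
size_t match_digits(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Stand-in for a character class such as "[abd]" at position s:
 * returns 1 if the first character is in `set`, otherwise 0. */
size_t match_one_of(const char *s, const char *set)
{
    assert(s != NULL && set != NULL);
    /* Guard against '\0': strchr would otherwise find the terminator. */
    return (*s != '\0' && strchr(set, *s) != NULL) ? 1 : 0;
}

/* Whole-string check equivalent to the regex "^[abd]\d*$"
 * (hypothetical format, for illustration only). */
int matches_tag(const char *s)
{
    size_t pos = match_one_of(s, "abd");

    if (pos == 0)
        return 0;
    pos += match_digits(s + pos);
    return s[pos] == '\0';   /* nothing may follow the digits */
}
```

The final comparison against the terminator plays the role of the `$` anchor: `matches_tag("a42")` succeeds, while `matches_tag("a42x")` and `matches_tag("c42")` are rejected. The trade-off is that every new format needs new code, where a regex library only needs a new pattern string.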
