Code Project

Most unusable technology award (my nomination - regular expressions)

65 Posts 40 Posters 0 Views 1 Watching
  • N NormDroid

    :thumbsup: Homework answered, the most cunningly disguised programming question yet.

    Software Kinetics Wear a hard hat it's under construction
    Metro RSS

    Andrew Wiles
    wrote on last edited by
    #28

    :-D But not my actual intent...........

    www.it-workplace.com
    "If a man speaks in a forest where there is no woman to hear him, is he still wrong?"

    • R Rhys Gravell

      A single regex is really not suitable for UK post code (incode and outcode) validation, as there are post codes still in use that do not conform to current rules (GIR 0AA, as above). There are 6 valid post code formats plus one invalid one that's in use. I would probably validate each valid (or invalid but in use) format individually with its own regex, as what you've got there is pretty much unreadable... Either that, or comment in a reference to the post code standards (which can be found here...[^]) and apologise profusely to anyone that comes to that monstrosity after you :-)

      Rhys "Technological progress is like an axe in the hands of a pathological criminal" "Two things are infinite: the universe and human stupidity; and I'm not sure about the the universe"

      Rob Grainger
      wrote on last edited by
      #29

      That would be my approach too, and I spent a long time working with the Postal Address File in the UK. Attempting to write one regex for the whole thing leads to monstrosities like the one demonstrated. That said, I too dislike REs. They seem to me to be easy and quick to write, but tough to read: good in an editor's search box or a tool like grep, bad in code that must be read by other developers. I don't object to regular languages, just the form of regex that has come into use over the years. There are a few examples of regular languages better done, notably in the MGrammar parser technology Microsoft tech-previewed a while ago (which I guess has, unfortunately, foundered in their dev labs, as they seem to have gone silent).
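A minimal sketch of the "one regex per format" approach discussed above, in Python. The six layouts listed here (A = letter, 9 = digit) are an assumption based on the commonly documented general formats, not something spelled out in the thread, and the name `matching_format` is purely illustrative. It checks shape only and says nothing about whether a code really exists.

```python
import re
from typing import Optional

# Assumption: the six general layouts plus the GIR 0AA special case.
# Each pattern is short enough to read and to test on its own.
FORMATS = {
    "A9 9AA": r"[A-Z][0-9] [0-9][A-Z]{2}",
    "A99 9AA": r"[A-Z][0-9]{2} [0-9][A-Z]{2}",
    "A9A 9AA": r"[A-Z][0-9][A-Z] [0-9][A-Z]{2}",
    "AA9 9AA": r"[A-Z]{2}[0-9] [0-9][A-Z]{2}",
    "AA99 9AA": r"[A-Z]{2}[0-9]{2} [0-9][A-Z]{2}",
    "AA9A 9AA": r"[A-Z]{2}[0-9][A-Z] [0-9][A-Z]{2}",
    "GIR 0AA": r"GIR 0AA",
}
COMPILED = {name: re.compile(pattern) for name, pattern in FORMATS.items()}


def matching_format(postcode: str) -> Optional[str]:
    """Return the name of the first layout the postcode fits, or None."""
    for name, pattern in COMPILED.items():
        if pattern.fullmatch(postcode):
            return name
    return None
```

A failing input can then be reported against a named layout instead of against one opaque expression.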

      • N NormDroid

        Chris Maunder wrote:

        they are so ridiculously useful

        and so elegantly terse.

        Software Kinetics Wear a hard hat it's under construction
        Metro RSS

        Rob Grainger
        wrote on last edited by
        #30

        Norm .net wrote:

        elegantly terse

        Isn't that an oxymoron?

        • A Andrew Wiles

          Every now and then I need to solve a problem for which regular expressions look like the "perfect" answer. Today that happens to be validating and extracting UK postal codes from addresses. BUT every time I try to use regular expressions, I find that no-one (especially me) has a clue how to use them, and that all "posted" solutions can be demonstrated as flawed and therefore dangerous to use. The sheer complexity of the expressions makes them virtually impossible to read and therefore understand. I post the Wikipedia solution to demonstrate my case:

          (GIR 0AA)|(((A[BL]|B[ABDHLNRSTX]?|C[ABFHMORTVW]|D[ADEGHLNTY]|E[HNX]?|F[KY]|G[LUY]?|H[ADGPRSUX]|I[GMPV]|JE|K [ATWY]|L[ADELNSU]?|M[EKL]?|N[EGNPRW]?|O[LX]|P[AEHLOR]|R[GHM]|S[AEGKLMNOPRSTY]?|T[ADFNQRSW]|UB|W[ADFNRSV] |YO|ZE)[1-9]?[0-9]|((E|N|NW|SE|SW|W)1|EC[1-4]|WC[12])[A-HJKMNPR-Y]|(SW|W)([2-9]|[1-9][0-9])|EC[1-9] [0-9]) [0-9][ABD-HJLNP-UW-Z]{2})

          Can anyone think of a less usable technology?

          www.it-workplace.com
          "If a man speaks in a forest where there is no woman to hear him, is he still wrong?"

          Pete Appleton
          wrote on last edited by
          #31

          Alternate nomination: XSL

          -- What's a signature?

            Lost User
            wrote on last edited by
            #32

            It's not that bad. In C# you can format and comment them, and they start becoming understandable.
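To illustrate the formatting-and-commenting point, here is a sketch in Python rather than C#, using `re.VERBOSE`, which serves a similar purpose. It lays out the shorter general-shape pattern that appears elsewhere in this thread; the group names `outward` and `inward` are illustrative additions, not part of the original pattern.

```python
import re

# Verbose mode ignores unescaped whitespace in the pattern and treats
# "#" to end-of-line as a comment, so the pattern can be laid out and annotated.
POSTCODE = re.compile(
    r"""
    (?P<outward>
        [A-Z]{1,2}            # area: one or two letters
        [0-9R]                # a digit, or R (which lets GIR 0AA through)
        [0-9A-Z]?             # optional second district character
    )
    [ ]                       # a literal space must be written as [ ] in verbose mode
    (?P<inward>
        [0-9]                 # sector digit
        [ABD-HJLNP-UW-Z]{2}   # two unit letters from the permitted set
    )
    """,
    re.VERBOSE,
)
```

With named groups, `POSTCODE.fullmatch("LS1 9EL")` exposes the two halves as `group("outward")` and `group("inward")`, and `POSTCODE.search(...)` can pull a candidate out of a longer address string.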

              doc_net
              wrote on last edited by
              #33

              I think you are being a touch harsh on regular expressions. In this case it is the complexity of the postcode system that is causing the complexity of the regular expression not the other way round. Compare it to the regular expression for US zip codes (apologies if this is not entirely correct - this was grabbed from a quick google):

              ^(\d{5})(-\d{4})?$

              The problem is that here in the UK we use a complicated postcode system that is difficult to validate via regular expressions, but that is probably easier to read by the human eye. Plus, I would add that you have managed to validate the string in a single (albeit a very long one) line. That is a plus in my book. You could also use (again from wikipedia):

              [A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}

              as a simpler more readable alternative, but which will allow some non-existent codes that do fit the normal pattern.
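A small Python sketch of the two patterns quoted above, to make the trade-off concrete. The function names are hypothetical. It uses `fullmatch`, so the `^` and `$` anchors from the ZIP pattern are not needed.

```python
import re

# US ZIP code: five digits, optionally followed by a hyphen and four more digits.
US_ZIP = re.compile(r"\d{5}(-\d{4})?")

# The simpler UK pattern quoted above: it checks the general shape only.
UK_SIMPLE = re.compile(r"[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}")


def looks_like_us_zip(text: str) -> bool:
    """True if the whole string has the shape of a ZIP or ZIP+4 code."""
    return US_ZIP.fullmatch(text) is not None


def looks_like_uk_postcode(text: str) -> bool:
    """True if the whole string has the general shape of a UK postcode."""
    return UK_SIMPLE.fullmatch(text) is not None
```

The simple UK pattern accepts 'LS1 9EL' and 'GIR 0AA', but it also accepts a string such as 'QQ1 1AA', whose area letters do not appear anywhere in the longer expression's list. That is the kind of non-existent code the post warns about.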

              • A Andrew Wiles

                Nearly, but it doesn't cover the case where the user has not entered the space (e.g. 'LS1 9EL' vs 'LS19EL'). We also have some other occasional but common variations, such as 'LS1_9EL', that we can try to parse for. My understanding is also that whilst this expression will validate the general format of the postcode, there are specific exceptions that it does not cover. Unfortunately the task is not one of validating data at point of entry but of matching data that has not been properly validated in the first place (>5m records), so referring to a web service such as the Bing API is ruled out for performance reasons.

                www.it-workplace.com
                "If a man speaks in a forest where there is no woman to hear him, is he still wrong?"

                greldak
                wrote on last edited by
                #34

                Forcing the space is trivial: every postcode ends SPACE NUMERIC ALPHA ALPHA, so if there's no space, just add it. The problem only comes in if you need to deal with partial entries, i.e. is B11 supposed to be B1 1?? or B11 ???. If you need to properly validate, as opposed to just ensuring the format is correct, you will need to perform a lookup against the likes of PAF, so validating at point of entry and also performing batch validation on existing data is really the best option.
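A sketch of the "just add the space" idea in Python, also dropping stray separators such as the underscore in 'LS1_9EL' mentioned earlier in the thread. The function name `normalise_postcode` and the length cut-off are assumptions made for illustration. It only tidies the layout and does not validate anything.

```python
import re


def normalise_postcode(raw: str) -> str:
    """Strip separators, upper-case, and put one space before the last three characters."""
    # Drop everything that is not a letter or digit (spaces, underscores, hyphens).
    compact = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    # Assumption: anything shorter than five characters is a partial entry
    # (e.g. "B11") and cannot be split reliably, so it is returned unsplit.
    if len(compact) < 5:
        return compact
    # The inward part is always the final three characters: digit, letter, letter.
    return f"{compact[:-3]} {compact[-3:]}"
```

Running this once over the existing records before any pattern matching keeps the formatting concerns out of the regex itself.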

                  RichK67
                  wrote on last edited by
                  #35

                  Maybe my RE-fu is strong today, but this seems fairly easy to follow - just a matter of breaking it down: GIR special case (permitted letter groups starting with each letter of alphabet) number (and additional following number) London variants (E, EC, etc) etc... However, I personally would have split the London codings away from the rest, as this RE permits say "YO1EC4 1XY" which is nonsense.

                    HiteshSharma
                    wrote on last edited by
                    #36

                    I nominate the Windows command prompt... but I don't agree with regular expressions being unusable. It's been years since they were developed, and even today they perform very well the basic task for which they were created: to save space in code. Whenever I use them, I think of all the if-else conditions that would have haunted me in the absence of regular expressions.

                    • W W Balboos GHB

                      I first saw that line used in the end-of-mag article for "Language" magazine (an early issue). The article showed how to write a program that called a function that returned the cube of the numbers 1 through 10, doing so in a large number of languages for comparison. Interestingly, the shortest version was one of the Unix shells. I believe the magazine has gone away to wherever they go to.

                      "The difference between genius and stupidity is that genius has its limits." - Albert Einstein

                      "As far as we know, our computer has never had an undetected error." - Weisert

                      "If you are searching for perfection in others, then you seek disappointment. If you are seek perfection in yourself, then you will find failure." - Balboos HaGadol Mar 2010

                      Stefan_Lang
                      wrote on last edited by
                      #37

                      I recall reading that line at least some 10 years ago, and I wondered where it came from. Apparently it's much older: at least from 1964

                        Stefan_Lang
                        wrote on last edited by
                        #38

                        If variable formats are that much of an issue, why not just put the data through a preprocessor that standardizes the format (in your example, to weed out superfluous spaces and '_')? It sounds impractical to me to add formatting issues to the RE.

                          Stefan_Lang
                          wrote on last edited by
                          #39

                          Yes, I've considered that too - and most of the XML-related stuff. But at least you have the option to make it halfway readable. It can be a real pain though if you have to delve into other people's code and the original programmer took no care over naming conventions and formatting...

                            rtpHarry
                            wrote on last edited by
                            #40

                            I get where you are coming from, but you are only saying it because you haven't learned regex. It's a bit like saying I don't think Spanish is a usable language because I haven't learned it :) I have casually learned it over the last few years and I couldn't live without it these days. First off, you need a tool (I use Expresso, linked elsewhere in this thread). This will parse your regex into a tree that helps you understand it. It also lets you store test cases so you can quickly check what is valid and what isn't. If you so choose, you can format your regex on multiple lines and add # comments explaining what each line does. That is a big regex that you have posted, but it's because it is trying to do something complicated. You can make a regex to recognise potential postcodes, which would be a lot simpler and would just recognise a pattern of letters+numbers, but that one takes into account every known postcode and only allows valid postcodes. It depends what you're trying to achieve. Also, there are sites that share regexes, and people can vote on them. Anyway, I'm obviously a fan of regex :) Here is my vote for most unusable tech: the Brainfuck programming language http://en.wikipedia.org/wiki/Brainfuck[^]

                              rtpHarry
                              wrote on last edited by
                              #41

                              Also... that regex is missing the special case SANTA1, so how are the kids going to send their letters to Santa? :P

                                Stefan_Lang
                                wrote on last edited by
                                #42

                                Reminds me very much of some of the formulas in the Excel Sheets we're supposed to fill in for certain reports - only those are much longer! It can get pretty awful to find the problem when they reference not only other sheets, but sheets in other tables that are supposed to be in the same folder (but weren't included in the copy you were sent) :doh: On the plus side, they are slightly more verbose. Not that it helps ...

                                  Pete Appleton
                                  wrote on last edited by
                                  #43

                                  Quite true, though it's possible to make a regex readable (at least in Perl) by using the "ignore whitespace" option, which allows you both to break it over several lines and to embed comments; see this article[^]

                                  -- What's a signature?

                                    0bx
                                    wrote on last edited by
                                    #44

                                    You're not supposed to understand it, just go to a "regular expression website", read the description and the comments and copy/paste what you need. Never try to debug it yourself, just complain to the person who wrote it. :laugh: There's a community of "special people" for everything.

                                    Giraffes are not real.

                                    • S Slacker007

                                      Manfred R. Bihy wrote:

                                      If the only tool you've got is a hammer, every problem looks like a nail!

                                      I like that. :thumbsup:

                                      Just along for the ride. "the meat from that butcher is just the dogs danglies, absolutely amazing cuts of beef." - DaveAuld (2011)
                                      "No, that is just the earthly manifestation of the Great God Retardon." - Nagy Vilmos (2011) "It is the celestial scrotum of good luck!" - Nagy Vilmos (2011)

                                      Mark_Wallace
                                      wrote on last edited by
                                      #45

                                      Slacker007 wrote:

                                      Manfred R. Bihy wrote:

                                      If the only tool you've got is a hammer, every problem looks like a nail!

                                      I like that. :thumbsup:

                                      I believe it was a quote from Jesus, in an early draft of the Bible. It was discarded in the final edit, along with many other gems, like "Ow, my #$%&ing thumb!"

                                      I wanna be a eunuchs developer! Pass me a bread knife!

                                        Mark_Wallace
                                        wrote on last edited by
                                        #46

                                        Then test for basic, then retest the exceptions. Beats the Hell out of a 300-character regex.
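One way to read "test for basic, then retest the exceptions" is a two-step check, sketched here in Python. The basic pattern and the contents of `EXCEPTIONS` are assumptions for illustration; only GIR 0AA is taken from this thread.

```python
import re

# Step 1: a short pattern for the general shape (letters, digit, optional
# extra character, space, digit, two letters).
BASIC = re.compile(r"[A-Z]{1,2}[0-9][0-9A-Z]? [0-9][A-Z]{2}")

# Step 2: a hypothetical list of known codes that break the basic shape.
EXCEPTIONS = {"GIR 0AA"}


def is_plausible(postcode: str) -> bool:
    """Accept anything with the basic shape, then fall back to the exception list."""
    if BASIC.fullmatch(postcode):
        return True
    return postcode in EXCEPTIONS
```

New special cases then become additions to a set rather than further alternations in the expression.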

                                        I wanna be a eunuchs developer! Pass me a bread knife!

                                          Michael Haines
                                          wrote on last edited by
                                          #47

                                          Can I nominate Lotus Notes?

                                          "I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand." - Douglas Adams
