Help with regex HTML form validation Part 2

  Matt T Heffron wrote:

    Don't forget the characters that include diacritical marks. E.g., ö Å ç

    A positive attitude may not solve every problem, but it will annoy enough people to be worth the effort.

    robwm1 wrote (#3):

    Is there a way to check for that without having to list every Unicode character? I didn't see any accented names in our database but that certainly doesn't mean it can't happen in the future. I'd prefer to not include all Unicode characters. Just the ones with a high likelihood of showing up. I imagine that it could only be characters that would be accepted by Active Directory.

      Matt T Heffron wrote (#4):

      At least with the .NET Regex (see http://msdn.microsoft.com/en-us/library/20bw873z(v=vs.110).aspx#CategoryOrBlock; I don't know about other engines), you can specify the Unicode character category (here, "Letter"), so your regex would be:

      ^[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Lm}\-\s']+$

      possibly even just

      ^[\p{L}\-\s']+$
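
For anyone who wants to try the "any Unicode letter" idea outside .NET, here is a rough Python sketch. Python's built-in `re` module does not support `\p{L}`, so this swaps in a per-character check with `unicodedata.category`, which reports the same general categories (Ll, Lu, Lt, Lo, Lm). The function name `is_valid_name` is made up for illustration, and `str.isspace()` is only roughly equivalent to `\s`.

```python
import unicodedata


def is_valid_name(name: str) -> bool:
    """Approximate ^[\\p{L}\\-\\s']+$ : letters, hyphen, whitespace, apostrophe."""
    if not name:
        # The + quantifier requires at least one character.
        return False
    for ch in name:
        is_letter = unicodedata.category(ch).startswith("L")
        if not (is_letter or ch in "-'" or ch.isspace()):
            return False
    return True


if __name__ == "__main__":
    for sample in ["M\u00fcller", "O'Leary", "Jones-Smith", "john1"]:
        print(sample, is_valid_name(sample))
```

One caveat: a name typed with a separate combining mark (for example "o" followed by U+0308) is rejected here, because combining marks are category Mn, not a letter category. The `\p{L}` class has the same limitation.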

        robwm1 wrote (#5):

        After looking at that link, a person could go crazy trying to catch every possibility. Looks like regex can be very thorough! Thanks for the help!

          Matt T Heffron wrote (#6):

          Yes!! There's a reason the "Mastering Regular Expressions" book is 496 pages!!!

            robwm1 wrote (#7):

            How would you allow for a period only at the end of the string, for the case where a name ends in Jr. or Sr.? A period wouldn't normally appear in any other position in a last name. I'm going with the pattern below so far. I'm double-checking names in Active Directory, but I'm reasonably sure you can't use diacritical characters. I need to research that to be certain.

            ^[a-zA-Z\-\s']+$

              Matt T Heffron wrote (#8):

              ^[a-zA-Z\-\s']+\.$

              Add the \. right before the $

                robwm1 wrote (#9):

                That works perfectly. I'm really starting to get the hang of this.

                  Matt T Heffron wrote (#10):

                  I'd be awfully surprised if the only characters allowed in Active Directory worldwide are the basic ASCII-ish letters.

                    robwm1 wrote (#11):

                    I know we have at least one person who has an accented 'e' in their last name, but it's not entered that way in Active Directory. I don't know whether that is because the person making the entry didn't know how to type the accented character or because it was disallowed. I'll definitely research this to be sure before I make a final decision to leave it out. I will post my findings here.

                      Matt T Heffron wrote (#12):

                      Check out the Expresso tool (free) to explore regular expressions!

                        robwm1 wrote (#13):

                        Right, that is actually the tool I'm using. I bumped into it a couple of years ago but this is the first time I ever used regex.

                          robwm1 wrote (#14):

                          Well, this pattern was working yesterday on a different computer at work. I installed Expresso on my personal computer so I could work on my project over the weekend, and now the pattern is not working.

                          ^[a-zA-Z\-\s']+\.$

                          john1 = no matches

                          The pattern should match the number one because numbers are not allowed, but the results are blank when I run this pattern. I could have sworn that this was working yesterday.

                          EDIT: I did some further testing and discovered that the \. is breaking the pattern. If there is no period at the end, then count = 0. This pattern seems to require the period at the end, and then it works correctly. The period should be allowed 0 or 1 times at the end of the string. So the pattern below is working the way I want it to in Expresso, but not when I use it in an HTA using VBScript to do the pattern matching. VBScript is throwing an error at the line where the pattern is executed.

                          ^[a-zA-Z\-\s']+?\.$

                          Not sure how to make a pattern that works in Expresso also work with VBScript.

                          SOLUTION:

                          ^[a-zA-Z\-\s']+?\.$

                          This pattern works when testing in Expresso but doesn't work with VBScript, although it may work when used with other languages.

                          ^[a-zA-Z\-\s']+\.{0,1}$

                          This is the pattern that behaves the same way as the pattern above but also works with VBScript.

                          MATCHES:
                          Jones
                          Jones-Smith
                          Jones Smith (no hyphen)
                          O'Leary
                          Van Allen (no hyphen)
                          Vander Ark (no hyphen)
                          Jones Sr.

                          Although this doesn't address diacritical characters, a few conversations with colleagues resulted in the decision that the risk is very low that they will be used in Active Directory. We currently have only 3 techs making entries into AD, so informing them of how this pattern works will reduce the risk even further. I have worked for my organization for 14 years and no diacritical characters have been used until now, so I feel pretty safe in not testing for them. It may not be the ultimate approach, such as you would want when selling a product to the public, but it does meet the needs of the specifications that were given to me.

                          Thank you! I'd like to give a shout-out to everyone who helped me out with this project! I really appreciate all of you taking the time to steer me in the right direction! I would go as far as to say that CodeProject could be just as valuable as sitting in any classroom. You may not get a certification here, but the knowledge gained is invaluable. I was able to gain a solid understanding of regex in a matter of a few hours. I watched several videos but I wo

                            Matt T Heffron wrote (#15):

                            robwm1 wrote:

                            ^[a-zA-Z\-\s']+?\.$

                            This was so... close. When I suggested the \. I forgot the conditional aspect of the dot at the end. (Sorry.) Just move the ? to be after the \.

                            ^[a-zA-Z\-\s']+\.?$

                            the ? means exactly the same thing as {0,1}
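
To see the difference between the variants discussed in this thread, here is a small Python sketch that runs all four against the sample names. It uses Python's `re` module only as a convenient test bed, so it says nothing about how VBScript's RegExp object behaves. The helper `accepts` and the two sample lists are made up for illustration (the "should match" names come from post #14).

```python
import re

# Period required at the end (post #8).
REQUIRED = re.compile(r"^[a-zA-Z\-\s']+\.$")
# The ? after + only makes the + lazy; the period is still required.
LAZY = re.compile(r"^[a-zA-Z\-\s']+?\.$")
# The ? after \. makes the period optional.
OPTIONAL = re.compile(r"^[a-zA-Z\-\s']+\.?$")
# Same thing spelled with an explicit {0,1} count.
BRACES = re.compile(r"^[a-zA-Z\-\s']+\.{0,1}$")

SHOULD_MATCH = ["Jones", "Jones-Smith", "Jones Smith", "O'Leary",
                "Van Allen", "Vander Ark", "Jones Sr."]
SHOULD_FAIL = ["john1", "Jones.Smith", "Jones Sr.."]


def accepts(pattern: re.Pattern, name: str) -> bool:
    """Return True if the pattern matches the name."""
    return pattern.search(name) is not None


if __name__ == "__main__":
    for name in SHOULD_MATCH + SHOULD_FAIL:
        print(f"{name!r:15} required={accepts(REQUIRED, name)} "
              f"lazy={accepts(LAZY, name)} optional={accepts(OPTIONAL, name)} "
              f"braces={accepts(BRACES, name)}")
```

In this sketch the REQUIRED and LAZY patterns both reject "Jones" because neither makes the period optional, while OPTIONAL and BRACES give identical results on every sample.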

                              robwm1 wrote (#16):

                              I never thought to move the ? to the end. You're right though, it is the same result as {0,1}. Thanks again!
