Parsing user input

  • Maximilien wrote:

    You really need to parse addresses? If you start doing that, there will always be outliers that you will miss. :confused:

    I'd rather be phishing!

    Nelek wrote (#15):

    Maximilien wrote:

    If you start doing that, there will always be outliers that you will miss.

    Software development is a constant war with the universe... Developers trying to build better idiot-proof software and the universe trying to produce even dumber users... So far the universe is winning.

    M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.

    • Marc Clifton wrote:

      Examples (#'s have been removed):

      P O BOX
      P.O. BOX
      PMB
      PO B0X
      PO BO X
      PO BOK
      PO BOS
      BOX

      :sigh: The one with the 'K' is interesting. 'K' is on the opposite side of the keyboard -- I can understand the 'S'. The hardest part about parsing crap like this (there are 166,333 records) is determining what other variants I did not parse correctly (for example, considered as a street address, not a PO Box), not which ones I successfully accounted for. Marc

      Latest Article - Create a Dockerized Python Fiddle Web App Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny Artificial intelligence is the only remedy for natural stupidity. - CDP1802
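
One way to approach both halves of this problem (matching the known variants, and surfacing the ones that slipped through) is a tolerant pattern plus a similarity check for near misses. The sketch below is only an illustration in Python: the pattern, the `classify` helper, the 0.6 threshold, and the sample inputs other than the eight listed variants are assumptions, not anything taken from Marc's actual parser.

```python
import re
from difflib import SequenceMatcher

# Tolerant pattern for the variants listed above: an optional "P O" / "P.O." / "PO"
# prefix, then BOX allowing the observed typos (B0X, BO X, BOK, BOS), or PMB.
PO_BOX = re.compile(
    r"^\s*(?:(?:P\.?\s*O\.?\s*)?B\s*[O0]\s*[XKS]|PMB)\b",
    re.IGNORECASE,
)

def classify(line: str, threshold: float = 0.6) -> str:
    """Return 'po_box', 'review' (possible missed variant), or 'other'."""
    if PO_BOX.match(line):
        return "po_box"
    # Compare the first five letters (digits and punctuation dropped) with "POBOX".
    head = re.sub(r"[^A-Z]", "", line.upper())[:5]
    if head and SequenceMatcher(None, head, "POBOX").ratio() >= threshold:
        return "review"
    return "other"

variants = ["P O BOX", "P.O. BOX", "PMB", "PO B0X", "PO BO X", "PO BOK", "PO BOS", "BOX"]
print([classify(v) for v in variants])
print(classify("PO BPX 12"), classify("123 MAIN ST"))
```

The "review" bucket is the part aimed at the hard problem: records that look almost like a PO Box but were not matched get queued for a human instead of silently becoming street addresses. A pattern this loose will also produce false positives (a street such as "BOX ELDER RD" would match), so the threshold and the pattern would both need tuning against real data.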

      dan sh wrote (#16):

      Randomly throw them to various fields. They might not be bright enough to notice.

      "It is easy to decipher extraterrestrial signals after deciphering Javascript and VB6 themselves.", ISanti[^]

        sir_download_alot wrote (#17):

        Fully agree! This is mission impossible. How can one know that "BOS" should be "BOX" and not "BOSS" or "BOSSA NOVA"? Keep it simple and no risk, no fun!

          kalberts wrote (#18):

          We have several times received paper mail where the entire name/address is no more than an alphabet soup - yet it is delivered to us no more than one day delayed. The first time this happened we were really puzzled: How could the mailman know that the mail is intended for us? (It is!) Finally we realized that a keyboard "Left shift" operation would give our name and address correctly. Since then, we have seen both right and left shifts, of one hand or both hands. I asked a mail guy about it, and he confirmed that it is well known: If the name/address looks like alphabet soup, chances are 9 in 10 that a keyboard shift changes it to a sensible address. Maybe you should include full and partial (i.e. one-hand) right and left shifts in your user input parsing. But don't expect the shift machine instructions to be of great help for this task :-)
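
The keyboard-shift idea can be sketched roughly as follows. This is a minimal illustration assuming a US QWERTY layout; the `ROWS` table, the `shift_keys` helper, and the sample string are made up for the example. Only the full (both-hands) shift is shown; a one-hand shift would need the same mapping restricted to that hand's keys.

```python
# Keyboard rows of a US QWERTY layout (assumption; other layouts need other rows).
ROWS = ["1234567890", "QWERTYUIOP", "ASDFGHJKL;", "ZXCVBNM,./"]

def shift_keys(text: str, offset: int) -> str:
    """Replace each key with the one `offset` positions along its keyboard row."""
    out = []
    for ch in text.upper():
        for row in ROWS:
            i = row.find(ch)
            if i != -1 and 0 <= i + offset < len(row):
                out.append(row[i + offset])
                break
        else:
            # Spaces, unknown keys, or shifts that fall off the end of a row.
            out.append(ch)
    return "".join(out)

# Hands sitting one key too far right turn "MAIN ST" into soup;
# shifting back one key to the left recovers it.
garbled = shift_keys("MAIN ST", +1)
print(garbled)
print(shift_keys(garbled, -1))
```

In a parser this would be a last-resort step: try both directions on an otherwise unparseable record and keep a candidate only if it then parses as a sensible address.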

            fmsalmeida wrote (#19):

            Nelek wrote:

            Software development is a constant war with the universe... Developers trying to build better idiot-proof software and the universe trying to produce even dumber users...

            You made my day with this phrase!

              BryanFazekas wrote (#20):

              The universe will always win.

                Harrison Pratt wrote (#21):

                I did a mailing list cleanup like this in the Jurassic era using dBase ][. I ended up trimming excess blanks, doing upper/lower case normalization, and using a translation table lookup for common variants. I don't remember how I identified exceptions back then, but now I'd use a dialog with options to manually correct, ignore (add to lookup as IGNORE string), or add a translation record. Then there is the problem of dealing with addresses foreign to your country ... whew! Yup, this is a problem to be managed, not solved, if unfiltered inputs are continuously added.
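
The cleanup steps described above (trim blanks, normalize case, look up known variants, collect the rest as exceptions) might look something like this in Python. It is a sketch under assumptions: the `TRANSLATIONS` entries, the use of `None` as the IGNORE marker, and the `clean` helper are all hypothetical, and the interactive dialog is reduced to a plain list of exceptions for someone to work through later.

```python
import re
from typing import List, Optional

# Hypothetical translation table: normalized variant -> canonical form.
# None stands in for the IGNORE marker mentioned above.
TRANSLATIONS = {
    "P O BOX": "PO BOX",
    "P.O. BOX": "PO BOX",
    "PO B0X": "PO BOX",
    "PO BO X": "PO BOX",
    "N/A": None,
}

def normalize(text: str) -> str:
    """Collapse runs of whitespace, trim the ends, and upper-case."""
    return re.sub(r"\s+", " ", text).strip().upper()

def clean(prefix: str, exceptions: List[str]) -> Optional[str]:
    """Translate a known variant; queue anything unknown for manual review."""
    key = normalize(prefix)
    if key in TRANSLATIONS:
        return TRANSLATIONS[key]
    exceptions.append(key)  # a person later corrects, ignores, or adds a record
    return key

pending: List[str] = []
print(clean("  p  o   box ", pending))
print(clean("PO BOK", pending))
print(pending)
```

Each decision made on a queued exception becomes a new row in the table, which is what makes this a problem that is managed over time instead of solved once.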

                  Marc Clifton wrote (#22):

                  Maximilien wrote:

                  You really need to parse addresses? If you start doing that, there will always be outliers that you will miss.

                  Sadly yes. And outliers are acceptable as we're trying to fill in some form fields that break out address, PO Box, and Rural Routes, and if everything fails, the address just gets put into the Address1 field. We're aiming for improvement rather than perfection. :) Marc

                  Latest Article - Create a Dockerized Python Fiddle Web App Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny Artificial intelligence is the only remedy for natural stupidity. - CDP1802
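
The "improvement rather than perfection" approach in the post above, where recognized forms fill their own fields and everything else lands in Address1, could be sketched like this. The field names on `ParsedAddress`, the two strict patterns, and the `parse` helper are assumptions chosen for illustration; the real form fields and rules are not shown in the thread.

```python
import re
from dataclasses import dataclass

@dataclass
class ParsedAddress:
    address1: str = ""
    po_box: str = ""
    rural_route: str = ""

# Deliberately strict patterns: anything they do not recognize falls through.
PO_BOX_RE = re.compile(r"^\s*P\.?\s*O\.?\s*BOX\s*#?\s*(\w+)\s*$", re.IGNORECASE)
RURAL_RE = re.compile(r"^\s*(?:RR|R\.R\.|RURAL\s+ROUTE)\s*#?\s*(\w+)\s*$", re.IGNORECASE)

def parse(line: str) -> ParsedAddress:
    """Fill the most specific field that matches; fall back to address1."""
    text = line.strip()
    m = PO_BOX_RE.match(text)
    if m:
        return ParsedAddress(po_box=m.group(1))
    m = RURAL_RE.match(text)
    if m:
        return ParsedAddress(rural_route=m.group(1))
    return ParsedAddress(address1=text)  # best guess failed: keep the raw text

for sample in ["P.O. Box 123", "RR#1", "123 Main St", "PO BOK 7"]:
    print(parse(sample))
```

Because the fallback keeps the raw text, a missed variant such as "PO BOK 7" costs nothing worse than an unstructured Address1 value.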

                  • Tim Carmichael wrote:

                    When we put our mail on vacation hold, it validates and 'normalizes' the address, so I do understand what you're working with. Where I grew up, our address was RR#1; it wasn't until I was in my teens that we had an address with a number and street name. So.. consider this.. are you only dealing with P.O. and its variants or do you have R.R. addresses as well?

                    englebart wrote (#23):

                    I think the counties try to eliminate RR addresses when they implement 911 emergency service. Ambulance dispatcher: Code red, RR 23, box 99 ... Ambulance navigator: We are on the correct Route... 1 mailbox, 2 mailbox, etc...

                      englebart wrote (#24):

                      My guess on the "K" is that some robot filled it in based on a record created via OCR. The United States Postal Service has a service you can use to "normalize" addresses. I suspect that each country has something similar. There is probably a service provider that aggregates all of these normalization services into one spot. (Amazon?)

                        Dan Neely wrote (#25):

                        Marc Clifton wrote:

                        The one with the 'K' is interesting. 'K' is on the opposite side of the keyboard -- I can understand the 'S'.

                        Optically Corrupted Recognition?

                        Did you ever see history portrayed as an old man with a wise brow and pulseless heart, waging all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt

                        • Marc Clifton wrote:

                          RR, CR, HC, etc., as well as regular street addresses (as best as those are). Perfect accuracy is not necessary, just best guess. :) Marc

                          Latest Article - Create a Dockerized Python Fiddle Web App Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny Artificial intelligence is the only remedy for natural stupidity. - CDP1802

                          agolddog wrote (#26):

                          Well, then just parse the city & state/province and geocode to the center of that.

                          • Gary Wheeler wrote:

                            I smell OCR in the mix - hence the BOK, BOS, B0X, etc.

                            Software Zen: delete this;

                            vtokar wrote (#27):

                            :thumbsup:

                              kristopher baker wrote (#28):

                              Excellent point. Are there services that allow you to force user input validation of addresses against the USPS databases?

                                MikeTheFid wrote (#29):

                                You could fashion the UI to eliminate the need to parse P.O. Box... etc. Have a drop down that contains these options: Street #, P.O. Box, RR#, CR, HC, etc. And to the right of it, place a text box that accepts the actual number. Just a thought off the top.

                                Cheers, Mike Fidler "I intend to live forever - so far, so good." Steven Wright "I almost had a psychic girlfriend but she left me before we met." Also Steven Wright "I'm addicted to placebos. I could quit, but it wouldn't matter." Steven Wright yet again.
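
On the data side, the drop-down-plus-text-box suggestion amounts to storing the delivery type and the number as separate, validated values so that nothing needs parsing afterwards. A small Python sketch of that shape, where the `DeliveryKind` and `DeliveryLine` names and the alphanumeric-only rule are assumptions made for the example:

```python
from dataclasses import dataclass
from enum import Enum

class DeliveryKind(Enum):
    """The drop-down options listed in the post above."""
    STREET = "Street #"
    PO_BOX = "P.O. Box"
    RR = "RR#"
    CR = "CR"
    HC = "HC"

@dataclass(frozen=True)
class DeliveryLine:
    kind: DeliveryKind  # chosen from the drop-down
    number: str         # typed into the text box

    def __post_init__(self):
        # The text box should hold only the number itself (letters allowed, e.g. "12A").
        if not self.number.isalnum():
            raise ValueError(f"unexpected characters in number: {self.number!r}")

    def formatted(self) -> str:
        return f"{self.kind.value} {self.number}"

line = DeliveryLine(DeliveryKind.PO_BOX, "123")
print(line.formatted())
```

This only helps for input collected going forward; records that already exist as free text still need the parsing discussed elsewhere in the thread.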

                                • Chris Losinger wrote:

                                  welcome to my life

                                  Vikram A Punathambekar wrote (#30):

                                  Woah... haven't seen you in a long time Chris. How's it going these days?

                                  Cheers, विक्रम "We have already been through this, I am not going to repeat myself." - fat_boy, in a global warming thread :doh:

                                    Chris Losinger wrote (#31):

                                    i'm here occasionally. not constantly, as previously. it goes... on and on and on and on. :)

                                    image processing toolkits | batch image processing

                                      Nelek wrote (#32):

                                      You are welcome :) :-D

                                      M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.

                                        Lost User wrote (#33):

                                        Just say: "We don't deliver to postal boxes; addresses only". (For real-time, I use online address validation services).

                                        "(I) am amazed to see myself here rather than there ... now rather than then". ― Blaise Pascal

                                          Vikram A Punathambekar wrote (#34):

                                          I still remember your old profile pic - with hand on your thoughtful face. Got it somewhere? :)

                                          Cheers, विक्रम "We have already been through this, I am not going to repeat myself." - fat_boy, in a global warming thread :doh:
