Unicode testing

    Tad McClellan wrote:

    So I just got out of a meeting where the testers said it would take longer to test the product because they needed to test it once with English and once with Chinese characters to make sure it was Unicode compliant. I told them that if they just test it with Chinese they should be fine. They objected to that idea, because then how would you know it works for English!!! Then someone had the bright idea that, well, maybe we need to test more than one language to make sure it's Unicode, so let's throw Japanese and Korean in the mix. Seriously I feel like I'm stuck in a Dilbert cartoon.

    TadMcClellan.Com

    Luc 648011 wrote (#5):

    What was in the product requirements document? :)

      Duncan Edwards Jones wrote (#6):

      Any right-to-left languages?

      '--8<------------------------ Ex Datis: Duncan Jones Merrion Computing Ltd

        Member 96 wrote (#7):

        In my experience, test with English, Thai, Chinese, and that European language where two a's in a row are considered a different character (I forget which: Norwegian or Swedish or something like that) if you want to be sure. Otherwise just test for each market you're after.


        "Creating your own blog is about as easy as creating your own urine, and you're about as likely to find someone else interested in it." -- Lore Sjöberg

          Yusuf wrote (#8):

          Tad McClellan wrote:

          Seriously I feel like I'm stuck in a Dilbert cartoon.

          Why? You're not the one who is testing it, are you? Let them test for all possible languages, which number around 5000. :-\ Seriously though, it does not take that long. They don't need to go extensively through all possible languages. :-O

          Yusuf Oh didn't you notice, analogous to square roots, they recently introduced rectangular, circular, and diamond roots to determine the size of the corresponding shapes when given the area. Luc Pattyn[^]

            Phil Martin wrote (#9):

            It sounds like your testers know what they are doing. In my experience the word "should" is the antithesis of quality software. I've lost count of the number of times I've been bitten in the past by slightly strange behaviors of other languages. All through my own fault of inadequate engineering, but the only reason they were found was because of the testing staff. Gold stars for them, I say. - Phil

              Mycroft Holmes wrote (#10):

              The thing about testers is they are anally retentive. Don't get me wrong, this is a good thing. It is the hallmark of a good tester: their ability to nitpick their way through your app is one of the most prized talents you can come across. You do of course need the ability to receive constructive criticism.

              Tad McClellan wrote:

              Seriously I feel like I'm stuck in a Dilbert cartoon.

              Imagine how they feel: can't this stupid dev SEE that problem? Seriously, I think you have a good tester there, let him play to his heart's content. Not your job to place constraints on his time, management will do that. After all, that's what they are there for, right?

              Never underestimate the power of human stupidity RAH

                Tad McClellan wrote (#11), in reply to Luc 648011:

                Just Unicode. The point is that if Chinese works then English will work. That's what happens when non-technical people start making decisions about the technology.

                TadMcClellan.Com

                  Tad McClellan wrote (#12), in reply to Duncan Edwards Jones:

                  No, thank God!

                  TadMcClellan.Com

                    Luc 648011 wrote (#13), in reply to Tad McClellan:

                    Tad McClellan wrote:

                    Just unicode.

                    IMO two things went wrong:

                    1. "Unicode" is a bad spec. Requirements must use functional terms as much as possible, and avoid technical terms. So it should list the languages that are required: English, full Chinese (not simplified Chinese), ... so you can come up with a design, an implementation plan and a test plan.
                    2. The test plan should be created earlier in the project. It is part of validating the requirements document! :)
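Once the requirements name the languages, the test plan can be driven straight from that list. A minimal sketch in Python, under loud assumptions: the thread never says what language the product is written in, the sample strings are made up, and `round_trips` is a hypothetical stand-in for whatever code path the real product would exercise (save and reload, database write and read, and so on).

```python
# Hypothetical sample strings, one per language named in the requirements.
SAMPLES = {
    "English": "Hello, world",
    "Traditional Chinese": "繁體中文測試",
    "Japanese": "日本語のテスト",
    "Korean": "한국어 테스트",
}

# Encodings the (assumed) product is expected to handle.
ENCODINGS = ("utf-8", "utf-16")


def round_trips(text: str, encoding: str) -> bool:
    """Return True if text survives an encode/decode cycle unchanged."""
    return text.encode(encoding).decode(encoding) == text


def failures() -> list[tuple[str, str]]:
    """List every (language, encoding) pair that fails the round trip."""
    return [
        (language, encoding)
        for language, text in SAMPLES.items()
        for encoding in ENCODINGS
        if not round_trips(text, encoding)
    ]
```

Adding a language to the requirements then means adding one entry to `SAMPLES`, which keeps the argument about "which languages" in the requirements document rather than in the test meeting.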

                      Henry Minute wrote (#14):

                      Every time this sort of topic comes up in The Lounge, one member (can't remember his name for the moment) pops up and avers that Turkish will trip it up. Where is he now that you need him? Anyway, I'd add it to the list of test languages as well.

                      Henry Minute Do not read medical books! You could die of a misprint. - Mark Twain Girl: (staring) "Why do you need an icy cucumber?" “I want to report a fraud. The government is lying to us all.”

                        Johann Gerell wrote (#15), in reply to Member 96:

                        Ä: http://en.wikipedia.org/wiki/Ä
                        Ö: http://en.wikipedia.org/wiki/Ö

                        -- Time you enjoy wasting is not wasted time - Bertrand Russel

                          Dan Neely wrote (#16), in reply to Henry Minute:

                          Maunder posted it to subtle bugs recently... http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html
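The usual trap behind the Turkish case is the dotted and dotless i: Turkish pairs i with İ and ı with I, so a culture-blind uppercase or case-insensitive compare gives a different answer than a Turkish user expects. A rough sketch in Python, which is an assumption since the thread names no implementation language. Python's `str.upper()` ignores the locale, so the Turkish rule is written by hand here, and `turkish_upper` and `naive_equal_ignore_case` are illustrative names, not library functions.

```python
# Turkish has four letters where English has two:
#   dotted:   i (U+0069) <-> İ (U+0130)
#   dotless:  ı (U+0131) <-> I (U+0049)
_TURKISH_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def turkish_upper(text: str) -> str:
    """Uppercase text using the Turkish rule for the two i letters."""
    # Map the two special letters first, then let str.upper() do the rest.
    return text.translate(_TURKISH_UPPER).upper()


def naive_equal_ignore_case(a: str, b: str) -> bool:
    """The common, culture-blind comparison."""
    return a.upper() == b.upper()


print("file".upper())         # FILE
print(turkish_upper("file"))  # FİLE
```

Code that compares a user-typed "file" against a constant "FILE" passes under the culture-blind rule and fails under the Turkish one, which is why Turkish earns a place on the test list.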

                          Today's lesson is brought to you by the word "niggardly". Remember kids, don't attribute to racism what can be explained by Scandinavian language roots. -- Robert Royall

                            Alan Balkany wrote (#17):

                            Another reason to test with multiple languages is that in some languages, some phrases get much longer than you'd expect, and this can throw off the layout of your GUI. If you're treating text as a constant two bytes per character (UCS-2, or UTF-16 handled as if every character were a single 16-bit unit), you'll be limited to characters in the Basic Multilingual Plane, which has most modern languages and the more common Asian characters. UTF-16 proper encodes characters outside that plane as surrogate pairs of four bytes. If you are limited to the BMP, make sure it has all the Chinese characters you'll need.
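UTF-16 is only two bytes per character inside the Basic Multilingual Plane; anything outside it takes a surrogate pair. A small Python check makes a handy test case (Python and the `utf16_code_units` helper are assumptions for illustration, not something from the thread):

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to store text as UTF-16."""
    # "utf-16-le" writes no BOM, so the byte count is exactly 2 per code unit.
    return len(text.encode("utf-16-le")) // 2


bmp_char = "\u4e2d"          # 中, inside the Basic Multilingual Plane
non_bmp_char = "\U00020000"  # a CJK Extension B ideograph, outside the BMP

print(utf16_code_units(bmp_char))      # 1
print(utf16_code_units(non_bmp_char))  # 2
```

Code that assumes one 16-bit unit per character will split or truncate the second string, so one non-BMP ideograph in the test data is worth having.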

                              pg az wrote (#18):

                              With e.g. a hex editor you can create, say, a Unicode file without the BOM, which Notepad at least is smart enough to recognize as little-endian Unicode, since the odd bytes are uniformly zero. When you do a "Save As" into Unicode from Notepad, it likes to insert the BOM, which seems inelegant to me since "it's not really a character": if you roll your own parsing routines, they need to know to skip over the BOM. I wonder, out there in the wide world, do real foreign-language files normally have BOMs or not? Offhand I would guess they DO, but that could of course be completely wrong.
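A home-grown parser has to cope with both cases: skip the BOM when it is there, and guess the byte order when it is not. A rough Python sketch of that idea follows. `decode_utf16_guess` is a hypothetical helper, and the zero-byte heuristic is only an illustration of the "odd bytes are zero" observation, not a claim about what Notepad actually does. It works for mostly-ASCII text only; Chinese text in UTF-16 has few or no zero bytes, so a BOM-less file of that kind cannot be classified this way.

```python
import codecs


def decode_utf16_guess(data: bytes) -> str:
    """Decode UTF-16 bytes, with or without a BOM (illustrative heuristic)."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No BOM: guess from where the zero bytes fall. For mostly-ASCII text,
    # little-endian puts zeros at odd offsets, big-endian at even offsets.
    odd_zeros = data[1::2].count(0)
    even_zeros = data[0::2].count(0)
    encoding = "utf-16-be" if even_zeros > odd_zeros else "utf-16-le"
    return data.decode(encoding)
```

Python's own `"utf-16"` codec already consumes a leading BOM, so the explicit checks are here only to show what a hand-rolled routine needs to do.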

                              pg--az

                                Member 96 wrote (#19), in reply to Johann Gerell:

                                Yeah, something like that. It bit me in the ass years ago because all the tables in our app are named starting with an A to be distinctive, and the queries threw errors on some computers. We traced it down to table names starting with AA being interpreted as that A with the dots above it.


                                "Creating your own blog is about as easy as creating your own urine, and you're about as likely to find someone else interested in it." -- Lore Sjöberg

                                  Fabio Franco wrote (#20), in reply to Tad McClellan:

                                  Tad McClellan wrote:

                                  That's what happens when non-technical people start making decisions

                                  Oh boy, I know the feeling. And I hate it. Recently, when this special non-techy manager started making some senseless decisions, I struggled not to scream at him: "Why don't you build the f#@$@% system yourself then?" :mad:

                                    Trevortni wrote (#21):

                                    I guess my question would be whether it's supposed to support all these different languages, or just be Unicode-compliant. If it's supposed to support different languages, it should be tested under all the languages it's supposed to support (which would actually be a translation issue, not a programming issue); otherwise, well, you already made your point.

                                      bjarneds wrote (#22), in reply to Member 96:

                                      You are probably thinking about A with a ring (not dots, a.k.a. umlaut) above it: http://en.wikipedia.org/wiki/Å. The A with a ring is a different character, one that is used in several Danish words. The old spelling of these words used the double AA instead of an A with a ring, and many names still use the double AA. Note that this letter (no matter if it is written as an A with a ring or a double AA) is the last character in the Danish alphabet. This means that the result of sorting the strings "AA" and "BB" depends on the current culture.

                                      Of course, you shouldn't be required to know details like this when you are coding. Instead, you should assume nothing when it comes to cultures, characters, spelling etc. I think the MSDN article Writing Culture-Safe Managed Code (http://msdn.microsoft.com/en-us/library/ms994325.aspx) may have a few surprises for most developers.

                                      So in my opinion, testing with different characters (and cultures) does make sense. Not only to make sure an application is Unicode compliant, but more importantly to catch some of the incorrect assumptions developers make about cultures etc.
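The "AA" versus "BB" case can be shown with a toy sort key. This is a deliberately simplified Python sketch, not real Danish collation: `danish_key` is a made-up helper that folds case, treats every "aa" as "å", and ranks letters by the Danish alphabet. A production system would use a proper collation library, which also handles accents, tie-breaks and the cases where "aa" should not be folded.

```python
# Danish alphabet order: a-z, then æ, ø, å. This toy key ignores everything
# a real collation library handles (accents, case tie-breaks, punctuation).
_DANISH_ORDER = "abcdefghijklmnopqrstuvwxyzæøå"
_RANK = {letter: index for index, letter in enumerate(_DANISH_ORDER)}


def danish_key(text: str) -> list[int]:
    """Toy sort key: fold case, treat 'aa' as 'å', rank by Danish alphabet."""
    folded = text.lower().replace("aa", "å")
    # Characters outside the alphabet sort after everything else.
    return [_RANK.get(ch, len(_DANISH_ORDER)) for ch in folded]


print(sorted(["BB", "AA"]))                  # ['AA', 'BB']
print(sorted(["BB", "AA"], key=danish_key))  # ['BB', 'AA']
```

The same two strings come out in opposite orders, which is exactly the kind of assumption a test run under a Danish culture would flush out.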

                                        bjarneds wrote (#23), in reply to pg az:

                                        pg--az wrote:

                                        "it's not really a character"

                                        Actually, the BOM (byte-order mark) is a real character, known as "zero-width no-break space". This is a good choice, because it does no harm to programs that just need to display the content, even if they don't skip over it (zero-width = invisible, no-break = no undesired wrapping behaviour).
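That the BOM is an ordinary code point, U+FEFF, is easy to confirm. A short Python sketch (Python is an assumption here, and `strip_bom` is an illustrative helper name): it looks up the character's Unicode name and shows that a decoder which does not know about the BOM hands it back as the first character of the text.

```python
import unicodedata

BOM = "\ufeff"  # the byte-order mark as a decoded character


def strip_bom(text: str) -> str:
    """Drop a single leading U+FEFF if present."""
    return text[1:] if text.startswith(BOM) else text


print(unicodedata.name(BOM))  # ZERO WIDTH NO-BREAK SPACE

raw = b"\xef\xbb\xbfabc"                   # UTF-8 bytes with a BOM in front
print(len(raw.decode("utf-8")))            # 4: the BOM survives as a character
print(len(strip_bom(raw.decode("utf-8")))) # 3
print(len(raw.decode("utf-8-sig")))        # 3: this codec strips it for you
```

Invisible on screen is not the same as harmless in code: the extra character still breaks string comparisons and parsers that expect the file to start with a specific token.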

                                          Member 96 wrote (#24), in reply to bjarneds:

                                          I agree. My code is fine, I've always adhered to Unicode standards; however, this problem was in the Firebird SQL drivers. I worked around it by ensuring that all my dynamic SQL had double CAPITAL A's.


                                          "Creating your own blog is about as easy as creating your own urine, and you're about as likely to find someone else interested in it." -- Lore Sjöberg
