
ANSI to UNICODE

  Taka Muraoka

    Having just recently struggled through converting a large app from ANSI to Unicode, I found this quite funny. I don't feel so bad now :-):omg::wtf: Although I have to admit, it's quite inventive :rolleyes:


    "Sucks less" isn't progress - Kent Beck [^] Awasu 1.1.2 [^]: A free RSS reader with support for Code Project.

    KaRl
    #5

    I know I take some risks, especially in the Soapbox, and that this will look like a programming question, and it is indeed quite borderline: what is the benefit of using wide strings (LPWSTR)? As long as I take care that a character may not be a single byte, and that the length of the string is not its size, I don't see the problem with storing a "Unicode-coded" string in an LPSTR. Some of the apps I work on use "char *" strings only, even when they deal with data written in Japanese, and there's no problem. The conversion is a heavy task, and moreover, Unicode applications are not directly Win95 compatible (it sounds strange, but some of our clients still use Win95). So, could you please explain to me the benefit of making such a conversion? (For the hitters: please avoid the head, thanks in advance.)


    Every gun that is made, every warship launched, every rocket fired, signifies in the final sense a theft from those who hunger and are not fed, those who are cold and are not clothed - Dwight D. Eisenhower

      Taka Muraoka
      #6

      KaЯl wrote: As long as I take care that a character may not be a single byte, and that the length of the string is not its size, I don't see the problem with storing a "Unicode-coded" string in an LPSTR.

      Unfortunately, it's not that easy. A lot of the API functions behave differently in Unicode builds. I wrote an article about my ordeal here[^], although admittedly most of my problems were STL-related. Nevertheless, given that I am a heavy STL user, this was a major issue. I'm actually right this very second trying to get my newly Unicoded app to work on Windows 98 and Me - what a PITA :-(

      KaЯl wrote: So, could you please explain to me the benefit of making such a conversion?

      Easy: we live in a globalized world. When writing software, you can no longer afford to assume that your clients work in English. Even if they are using single-byte character sets, there are issues if different people are using different code pages. Awasu is an RSS reader that accepts feeds from sites around the world - if it were ANSI, how would it handle a feed coming from Russia on a computer running in Greece? How do I show feed content from Spain and Persia ([edit]uh, Iran - I actually have a feed called BBC Persia :-O[/edit]) on the same page? And what about search? How do you do that with feeds using different character sets? It becomes very complicated very quickly. It's much easier in the long run to do everything in Unicode. And of course, I can also now handle Japanese, Chinese, Korean, etc. feeds.
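To make the code-page problem concrete, here is a minimal C++ sketch (an illustration only, not Awasu's code). The enum and function names are hypothetical, and the tables are deliberately partial: ASCII, the lowercase Cyrillic range of Windows-1251 and the lowercase Greek range of Windows-1253; every other byte maps to U+FFFD. The same six bytes decode to a Russian word under one code page and to unrelated Greek letters under the other.

```cpp
#include <cassert>
#include <string>

// Hypothetical names, for illustration only.
enum class CodePage { Windows1251, Windows1253 };

// Decode single-byte text into Unicode code points using a deliberately
// partial table: ASCII, Cyrillic 0xE0-0xFF (1251), Greek 0xE1-0xF9 (1253).
std::u32string DecodeLetters(const std::string& bytes, CodePage cp)
{
    std::u32string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out.push_back(static_cast<char32_t>(b));          // ASCII is shared
        } else if (cp == CodePage::Windows1251 && b >= 0xE0) {
            out.push_back(static_cast<char32_t>(b + 0x350));  // 0xE0 -> U+0430
        } else if (cp == CodePage::Windows1253 && b >= 0xE1 && b <= 0xF9) {
            out.push_back(static_cast<char32_t>(b + 0x2D0));  // 0xE1 -> U+03B1
        } else {
            out.push_back(U'\uFFFD');                         // outside this sketch's table
        }
    }
    return out;
}
```

Decoding the bytes EF F0 E8 E2 E5 F2 as Windows-1251 gives the Russian greeting "привет", while decoding them as Windows-1253 gives the Greek letters "οπθβες". Nothing in the bytes themselves says which reading is intended, which is the ambiguity a Unicode representation removes.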
      James Pullicino

        Just got this code from a co-worker...

        LPWSTR Ansi2Unicode( LPWSTR lpszwString, LPCSTR lpszString, UINT uMaxLen )
        {
            if( lpszwString == NULL || lpszString == NULL )
                return NULL;
            WritePrivateProfileStringA( "aaSection", "aaEntry", lpszString, "WIN.INI" );
            GetPrivateProfileStringW( L"aaSection", L"aaEntry", L"", lpszwString, uMaxLen, L"WIN.INI" );
            // Clear the entry
            WritePrivateProfileStringA( "aaSection", NULL, "", "WIN.INI" );
            return lpszwString;
        }

        What the hell is this shit? :confused: :confused: :mad:
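For comparison, the conversion can be done in memory without touching WIN.INI. The following is a minimal sketch, not anyone's production code: the helper name AnsiToWide is hypothetical, the Windows branch assumes the system ANSI code page (CP_ACP) is the right source encoding, and the non-Windows branch falls back to the standard std::mbstowcs (using the POSIX behaviour of a null destination to query the length), which depends on the current C locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: convert a narrow string to a wide string in memory.
// Returns an empty string for a null pointer or if the conversion fails.
std::wstring AnsiToWide(const char* src)
{
    if (src == nullptr)
        return std::wstring();
#ifdef _WIN32
    // First call asks for the required size, including the terminating null.
    int needed = MultiByteToWideChar(CP_ACP, 0, src, -1, nullptr, 0);
    if (needed <= 0)
        return std::wstring();
    std::vector<wchar_t> buf(static_cast<std::size_t>(needed));
    MultiByteToWideChar(CP_ACP, 0, src, -1, buf.data(), needed);
    return std::wstring(buf.data());
#else
    // With a null destination, mbstowcs reports the length without the null.
    std::size_t needed = std::mbstowcs(nullptr, src, 0);
    if (needed == static_cast<std::size_t>(-1))
        return std::wstring();
    std::vector<wchar_t> buf(needed + 1);
    std::mbstowcs(buf.data(), src, needed + 1);
    return std::wstring(buf.data());
#endif
}
```

A call such as AnsiToWide("hello") yields L"hello". Unlike the INI-file round trip, this involves no disk access, no shared file that two conversions could race on, and no limit imposed by what an INI value can hold.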

        Jorgen Sigvardsson
        #7

        thought wrote: What the hell is this shit?

        It's analogous to having only a hammer in your toolbox... (and only 2 brain cells) :)

        -- You still have your old friend Zoidberg. You all have Zoidberg!

          KaRl
          #8

          Taka Muraoka wrote: When writing software, you can no longer afford to assume that your clients work in English

          I totally agree. But as I said, some of the apps I work on are translated into Japanese and are able to store and edit Japanese data, and they are not using Unicode! (It's also true that none of them use C++ streams.) That's why I'm so confused about the use of Unicode. Oh, BTW, good article! :-D


            Taka Muraoka
            #9

            KaЯl wrote: But as I said, some of the apps I work on are translated into Japanese and are able to store and edit Japanese data, and they are not using Unicode!

            That's fine, as long as you are careful. But how well would your code manage if it also had to handle single-byte Greek, Arabic and Spanish?
              Rutger Ellen
              #10

              This is a plot from hardware manufacturers to sell us faster machines :);P

                KaRl
                #11

                The apps have the same code for the English, French and Japanese versions; only the resources change (with resource DLLs). And it also runs on W9x! ;)


                  Taka Muraoka
                  #12

                  You're talking about something different then: localization. For that, converting to Unicode is not really necessary. I'm talking about converting an app to manipulate its data as Unicode.
                    KaRl
                    #13

                    The data manipulated by the apps is also "localized"; it's not just about the graphical interface. For example, Japanese users can enter names with characters from the Japanese character set, and English users can enter ANSI characters. :eek: I'm lost and confused! :omg:


                      RChin
                      #14

                      :):):):) I can tell you're annoyed, but this is so funny I had to laugh. Instead of looking at this as bad code, think of it as a clever (however ridiculous) and transpositional way of solving a problem he probably had no immediate knowledge of. Look at it as a sign of thinking outside the loop. I have to laugh again! :-D:-D:laugh::laugh:

                      **I Dream of Absolute Zero**

                        Shog9
                        #15

                        Not an expert on this by any means, but:

                        1) If you're talking about MBCS, then yeah, it works - it's a royal PITA, but it works... until you hit a module that doesn't care about it, at which point it breaks badly as soon as you hit those odd characters. With wide chars, stuff either works or it breaks.

                        2) If you're talking about localization, the classic method for implementing this (all strings in resource DLLs, alternate DLLs installed depending on machine locale), then what happens when you get an English user wanting to generate reports for his French customers? Your simple plan now must be extended to support loading multiple translations, and things get hairy.
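A small C++ sketch of the kind of breakage point 1 describes, assuming Shift-JIS data (the function names are hypothetical). In Shift-JIS the second byte of some double-byte characters is 0x5C, the same value as an ASCII backslash, so a module that scans bytes without tracking lead bytes finds a path separator in the middle of a character. The lead-byte ranges are hard-coded here; on Windows, IsDBCSLeadByte answers the same question for the active code page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Shift-JIS lead bytes fall in 0x81-0x9F and 0xE0-0xFC.
bool IsSjisLeadByte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// What an MBCS-unaware module does: look for the last 0x5C byte.
std::size_t NaiveLastBackslash(const std::string& s)
{
    return s.rfind('\\');
}

// MBCS-aware version: walk forward and skip the trail byte of each
// double-byte character so it is never mistaken for a backslash.
std::size_t SjisLastBackslash(const std::string& s)
{
    std::size_t last = std::string::npos;
    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        if (IsSjisLeadByte(b)) { ++i; continue; }  // skip the trail byte too
        if (b == '\\') last = i;
    }
    return last;
}

// Number of characters, as opposed to the number of bytes (s.size()).
std::size_t SjisCharCount(const std::string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (IsSjisLeadByte(static_cast<unsigned char>(s[i]))) ++i;
        ++count;
    }
    return count;
}
```

For the 9-byte path "C:\\\x95\x5C.txt" (a file whose name is the single character 0x95 0x5C), the naive search reports a separator at byte 4, inside the character, while the aware search reports byte 2. The string is 9 bytes long but holds 8 characters, which is the "length is not its size" distinction every module has to respect.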

                        Shog9

                        I returned and saw under the sun, that the race is not to the swift, nor the battle to the strong...

                          KaRl
                          #16
                          1) Yep, we had to take care of all the modules to avoid the mess. Thank God we have control over the ones managing the data through the whole process.

                          2) a- Generally, an English user considers that the entire world should speak English, so he/she doesn't care about other languages.

                          b- Yep, it gets hairy everywhere. But as long as the different translations use alphabets, parameterization (static or dynamic, regular expressions or XML) does the trick. It does become really tricky with Asian languages, however. I can't be more precise on the subject; for the moment I don't have to manage this part of the product :cool:

                            Shog9
                            #17

                            KaЯl wrote: a- Generally, an english user considers that the entire World should speak english, so he/she doesn't care about the other languages.

                            This is less the case, however, when trying to sell them very expensive machinery. ;)

                              peterchen
                              #18

                              When I think of MBCS I always think of the thing that killed Tasha Yar.


                              "Dor säggsische Dialeggt eechnet sich wie keeen onderor für den Ausdrugg zäärdlischor Gefiehle."
                              sighist | Agile Programming | doxygen

                                Navin
                                #19

                                There are advantages and disadvantages. But mostly advantages. :) I like Unicode because:

                                • It's cleaner than MBCS/DBCS - each character really is a character. You don't have to worry about code pages and charsets, although you may have to ensure you use a font that contains all the characters you are going to show.
                                • We store all our translations in text files (resource DLLs are a royal PITA). If they are Unicode, you can see the text correctly even on English systems.
                                • WinNT and its brethren use Unicode strings internally, so you probably take a performance hit when you don't pass Unicode strings to APIs. I have seen some APIs in Windows 2000 and beyond that ONLY take Unicode - there is no ANSI version at all.
                                • AFAIK, C# is all Unicode; I am guessing .NET works in Unicode internally.
                                • You can get Unicode to run on Win95/98/Me! It requires Unicows.dll. Also, you can have partial Unicode support - I have a text file class that automatically checks to see if the file is Unicode, and if so, converts it to ASCII as it reads it in. Although I have run into difficulties with some (more obscure) APIs... but I think it is possible to override functions in Unicows.dll to get them to work right.

                                If your nose runs and your feet smell, then you're built upside down.
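The post does not say how the text file class decides that a file is Unicode. One common approach, shown here purely as an assumption, is to look for a byte order mark (BOM) at the start of the data. The enum and function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names; BOM sniffing is an assumed technique, not the
// poster's confirmed implementation.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Inspect the first bytes of a file's contents for a byte order mark.
Bom DetectBom(const std::string& data)
{
    auto at = [&](std::size_t i) { return static_cast<unsigned char>(data[i]); };
    if (data.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return Bom::Utf8;
    if (data.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return Bom::Utf16LE;   // the form Windows tools usually call "Unicode"
    if (data.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return Bom::Utf16BE;
    return Bom::None;          // no BOM: treat as text in the ANSI code page
}
```

This sketch has known limits: files without a BOM fall through to Bom::None even if they are Unicode, and a UTF-32 little-endian file, whose BOM begins with the same two bytes, would be reported as UTF-16LE.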

                                  Gary R Wheeler
                                  #20

                                  :wtf:


                                  Software Zen: delete this;
