DBCS issue

    David Crow wrote (#1):

    When our application is used on a Japanese system, we encounter a DBCS problem. Text entered into one of the edit controls may contain several shift-out/shift-in (SO/SI) pairs intermingled with "normal" text. For example:

    SoXXXXXXSiXSoXXXXXXSi

    When I call GetTextLength() on this text, it returns 13. If the text is saved to a text file (e.g., using Notepad, since it's all I can understand on a Japanese system), the size is reported as 13 (I guess the two SoSi pairs did not get saved).

    Now for the problem: when I send this text and its length (13) to the AS/400 system for processing, it complains about a mismatched SoSi pair. Debugging on the AS/400 end, we can change the length to 17 and it works fine. I've no clue how to handle this. If the above example were doubled, GetTextLength() would return 26, yet we'd have to change the value to 34 for it to work. Any clues? - DC
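One reading of these numbers, offered as an assumption rather than a confirmed diagnosis: the Windows-side text is Shift-JIS (code page 932), which carries no SO/SI bytes, and the host-side form adds one SO and one SI byte around each run of double-byte characters. Two runs then add 4 bytes, which gives 13 + 4 = 17. The sketch below computes that length. The names `isSjisLeadByte` and `hostLengthWithSoSi` are hypothetical, and the lead-byte ranges are hard-coded for code page 932 instead of calling `IsDBCSLeadByte`, so the code stays portable.

```cpp
#include <cstddef>
#include <string>

// Lead-byte ranges for Shift-JIS (Windows code page 932).
// Assumption: the text is in this code page.
bool isSjisLeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical helper: byte length of the text once every run of
// double-byte characters is bracketed by one SO and one SI byte.
std::size_t hostLengthWithSoSi(const std::string& sjis) {
    std::size_t length = 0;
    bool inDbcsRun = false;
    for (std::size_t i = 0; i < sjis.size();) {
        const unsigned char b = static_cast<unsigned char>(sjis[i]);
        // A lead byte with no byte after it is counted as a single byte.
        const bool isDouble = isSjisLeadByte(b) && i + 1 < sjis.size();
        if (isDouble != inDbcsRun) {
            ++length;            // SO when a run starts, SI when it ends
            inDbcsRun = isDouble;
        }
        const std::size_t step = isDouble ? 2 : 1;
        length += step;
        i += step;
    }
    if (inDbcsRun) {
        ++length;                // closing SI for a run that ends the text
    }
    return length;
}
```

For a 13-byte string made of three double-byte characters, one single-byte character, and three more double-byte characters, this returns 17, the value that worked on the AS/400 side. Whether the host really counts the SO/SI bytes this way would need to be checked against the actual conversion in use.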

    "Old age is like a bank account. You withdraw later in life what you have deposited along the way." - Unknown

    "The brick walls are there for a reason...to stop the people who don't want it badly enough." - Randy Pausch

      Graham Bradshaw wrote (#2):

      GetTextLength() is returning the length in characters. How are you determining the number of bytes to write to the file? For DBCS, length in characters is not the same as length in bytes.

      DavidCrow wrote:

      Any clues?

      You've considered using Unicode, I assume? If not, you may have to write your own strlen, hunting from the start to a terminating zero, to get the number of bytes.
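A minimal sketch of the "write your own strlen" idea, assuming the buffer holds zero-terminated Shift-JIS (code page 932) text. It walks the buffer once and reports both the byte count and the character count, so the difference between the two is visible. `TextLength`, `isSjisLeadByte`, and `measureSjis` are hypothetical names, and the lead-byte ranges are hard-coded for that code page rather than taken from the Windows API.

```cpp
#include <cstddef>

// Byte count and character count of one zero-terminated string.
struct TextLength {
    std::size_t bytes;
    std::size_t characters;
};

// Lead-byte ranges for Shift-JIS (Windows code page 932).
// Assumption: the text is in this code page.
bool isSjisLeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hunt from the start of the buffer to the terminating zero,
// counting bytes and characters along the way.
TextLength measureSjis(const char* text) {
    TextLength result{0, 0};
    const unsigned char* p = reinterpret_cast<const unsigned char*>(text);
    while (*p != 0) {
        // A lead byte followed by a non-zero trail byte is one
        // two-byte character; anything else is a one-byte character.
        const std::size_t step = (isSjisLeadByte(*p) && p[1] != 0) ? 2 : 1;
        result.bytes += step;
        result.characters += 1;
        p += step;
    }
    return result;
}
```

For text made of three double-byte characters, one single-byte character, and three more double-byte characters, this reports 13 bytes and 7 characters.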

        David Crow wrote (#3):

        Graham Bradshaw wrote:

        How are you determining the number of bytes to write to the file?

        I'm not. I just paste the text into Notepad and save it.

        Graham Bradshaw wrote:

        You've considered using Unicode, I assume?

        Yes, but I've read nothing thus far that says it would solve the problem. The application is 10+ years old so retooling it for Unicode would be no small undertaking.

          Graham Bradshaw wrote (#4):

          DavidCrow wrote:

          save it.

          Save it how? Which encoding did you select?

            David Crow wrote (#5):

            Graham Bradshaw wrote:

            Save it how?

            The Save option from the File menu.

            Graham Bradshaw wrote:

            Which encoding did you select?

            Once as ANSI (13 bytes) and once as UTF-8 (22 bytes).
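The 22-byte figure can be reproduced with simple arithmetic under two assumptions: the six double-byte characters each take three bytes in UTF-8 (true for kana and common kanji), and Notepad wrote a three-byte byte-order mark at the start of the file. That gives 6 × 3 + 1 + 3 = 22. The sketch below does the same sum in code; `utf8Bytes` and `utf8FileSize` are hypothetical helper names.

```cpp
#include <cstddef>
#include <vector>

// Number of bytes UTF-8 needs for one Unicode code point.
std::size_t utf8Bytes(char32_t codePoint) {
    if (codePoint < 0x80) return 1;
    if (codePoint < 0x800) return 2;
    if (codePoint < 0x10000) return 3;
    return 4;
}

// Size of a UTF-8 file holding the given code points, optionally
// preceded by the three-byte byte-order mark (EF BB BF).
std::size_t utf8FileSize(const std::vector<char32_t>& text, bool withBom) {
    std::size_t size = withBom ? 3 : 0;
    for (char32_t codePoint : text) {
        size += utf8Bytes(codePoint);
    }
    return size;
}
```

With three kanji, one ASCII letter, and three more kanji, `utf8FileSize` gives 19 without the mark and 22 with it, while the same text is 13 bytes in Shift-JIS.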

              Graham Bradshaw wrote (#6):

              So the text isn't 13 bytes long. It's more than that (UTF-8 encoding is not the same as DBCS, so the length of the file saved from Notepad as UTF-8 is not relevant).

              DavidCrow wrote:

              when I send this text and its length (13) to the AS/400 system for processing, it complains about a mismatched SoSi pair.

              And that 13 is surely the problem. You need to send the length in bytes to the AS/400, together with all the text, not the length in characters. This assumes, of course, that the AS/400 understands DBCS encoding.

                David Crow wrote (#7):

                Graham Bradshaw wrote:

                You need to send the length in bytes to the AS/400, together with all the text, not the length in characters.

                How do I go about doing this (since WM_GETTEXTLENGTH is giving me the latter)?

                Graham Bradshaw wrote:

                This assumes, of course, that the AS/400 understands DBCS encoding.

                It does. That's why it works (i.e., no data is lost) when I manually change the text length during debugging.

                  Graham Bradshaw wrote (#8):

                  DavidCrow wrote:

                  How do I go about doing this (since WM_GETTEXTLENGTH is giving me the latter)?

                  You're working in C++? If so, you must have a pointer to the start of the character buffer, so you can send it to the AS/400. Just hunt through byte by byte until you hit a zero, counting as you go.

                    daniel_zy wrote (#9):

                    The only way I know to calculate the number of bytes in a DBCS string for a given encoding is:

                    1. Convert the string to Unicode (MultiByteToWideChar).
                    2. Convert the Unicode string back to the original encoding (WideCharToMultiByte). This function returns the size of the DBCS string in bytes.

                    This works in all the cases I know of. :) Good luck.
