bstr, wchar_t, and code pages

C / C++ / MFC
    Samsung wrote (#1):

    Hello, I have text in a BSTR string. The text is in some code page. How do I convert the text from that code page to UTF-8 and put it in a wchar_t array?

      Michael Dunn wrote (#2):

      A BSTR holds a Unicode string, so no code page is involved. To convert it to UTF-8, call WideCharToMultiByte() with CP_UTF8 as the first parameter. Note that the destination is a char array, not a wchar_t array.
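
To make the conversion described above concrete, here is a minimal portable sketch of what a UTF-16 to UTF-8 conversion produces. It is a hand-written stand-in, not the Windows API: the helper name `utf16_to_utf8` and the use of `std::u16string` in place of a BSTR are assumptions for illustration. On Windows you would normally let WideCharToMultiByte() with CP_UTF8 do this work instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Windows API): encodes UTF-16 code units as UTF-8 bytes.
// An unpaired surrogate is replaced with U+FFFD, an illustrative policy choice.
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine it with the following low surrogate, if any.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // Low surrogate with no preceding high surrogate.
        }

        // Emit 1 to 4 bytes depending on the size of the code point.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The result is a sequence of bytes held in a `std::string`, which is why the destination is a char array. With the real API, the usual pattern is to call WideCharToMultiByte() once with a null output buffer and a size of 0 to get the required byte count, allocate that many bytes, then call it again to fill the buffer. With CP_UTF8, the last two parameters (the default character and the used-default flag) must be NULL.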

        Samsung wrote (#3):

        Michael Dunn wrote: "A BSTR holds a Unicode string, which has no code pages."
        The HTML page is taken from IE as a BSTR. What about the charset that is declared in the HTML? Does this mean there is no need to convert from that charset to UTF-8?

          Michael Dunn wrote (#4):

          When IE reads the HTML, it handles the declared charset itself. When you get the HTML via a COM method, it is returned as a BSTR (UTF-16 encoded, often loosely called UCS-2) because that is how strings are passed around in COM. So you need to be clear about what you want. If you want to convert that BSTR to UTF-8, see my previous answer.

            Samsung wrote (#5):

            I want UTF-8 in a wchar_t array. Thank you.

              Michael Dunn wrote (#6):

              You can't store UTF-8 in a wchar_t array, because UTF-8 is a byte-oriented encoding: its code units are single bytes, so it belongs in a char array.
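
As a small illustration of why this does not work, the sketch below copies each UTF-8 byte into its own wchar_t, which is roughly what "UTF-8 in wchar_t" would amount to. The helper name `widen_bytes` is hypothetical and exists only to show the mistake; it is not something to use in real code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper showing the mistake: every UTF-8 byte becomes one wchar_t.
std::wstring widen_bytes(const std::string& utf8)
{
    std::wstring out;
    for (char c : utf8) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out += static_cast<wchar_t>(static_cast<unsigned char>(c));
    }
    return out;
}
```

The letter é (U+00E9) is the two bytes 0xC3 0xA9 in UTF-8. After `widen_bytes`, the wide string holds two elements, 0x00C3 and 0x00A9, which any wide-character API reads as the two characters "Ã©" rather than as é. If a wide-character API is the goal, the BSTR contents are already in the right form and no conversion is needed.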
