CString to UTF-8

    Rupel
    wrote on last edited by
    #1

    hi all, i never cared that much about unicode, WCHAR, _T(), UTF-X and all these character-width things. but now someone wants a UTF-8-encoded string from my application. so i've been sitting here for hours reading FAQs and postings and still haven't found the CString::ToUtf8(/*out*/ BYTE *pUtf8String, /*out*/ long &nSize) function. ;) where is it? seriously: what's the easiest way to just get the CString i have into UTF-8? (my app is compiled without _UNICODE, so the chars in the CString are all 8 bits wide.) tia :wq

      Diddy
      wrote on last edited by
      #2

      I believe you are back to Win32 here. Use MultiByteToWideChar to convert the string, specifying CP_UTF8 as the code page. Store the result in a WCHAR array; from that point on I think you are without your faithful CString. In VS.NET there are CStringW and CStringA, which you can use explicitly, as well as the normal CString, which is set depending on the _UNICODE define, but not in VC6. You could always try the STL: string and wstring.

        Rupel
        wrote on last edited by
        #3

        but MultiByteToWideChar() doesn't construct a UTF-8 encoded string, does it? afaik the point of UTF-8 is that characters up to 0x7F are still only 8 bits wide. btw: i'm on VS.NET 2003 if that makes any difference. :wq
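
To make the width point concrete, here is a minimal sketch of how many bytes UTF-8 spends on a single Unicode code point. The function name `utf8_sequence_length` is made up for illustration; it is not part of Win32, MFC or the standard library.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes UTF-8 uses for one code point.
// Returns 0 for values that cannot be encoded
// (surrogates D800..DFFF, or anything above 10FFFF).
std::size_t utf8_sequence_length(char32_t cp)
{
    if (cp <= 0x7F)                   return 1;  // ASCII stays a single byte
    if (cp <= 0x7FF)                  return 2;  // e.g. U+00FC
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  // surrogates are not characters
    if (cp <= 0xFFFF)                 return 3;  // e.g. U+20AC
    if (cp <= 0x10FFFF)               return 4;  // outside the BMP
    return 0;
}
```

So plain ASCII text is byte-for-byte identical in UTF-8, while anything above 0x7F grows to two or more bytes.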

          Mike Dimmick
          wrote on last edited by
          #4

          // First, get UTF-16 version of the string
          MultiByteToWideChar(CP_ACP, /* etc */ );
          // Now convert to UTF-8
          WideCharToMultiByte(CP_UTF8, /* etc */ );

          The first call converts from the system's ANSI code page (CP_ACP) to UTF-16, the wide-character Unicode encoding Windows uses (2 bytes per character for most text). This is necessary to get the canonical Unicode values. The second performs the fairly simple transform from UTF-16 to UTF-8. Of course, if you're sure you'll only ever be using the ASCII character set, you already have UTF-8, since the first 128 characters map directly. Since you have an accent in your name, though (ü is ANSI code 252, and also Unicode U+00FC), I assume you'll need the proper technique.
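
For readers curious what the UTF-16 to UTF-8 step involves, the transform can be sketched by hand in portable C++. This is only an illustration of the encoding rules, not what the poster used: on Windows, WideCharToMultiByte with CP_UTF8 does the job. The name `utf16_to_utf8` is hypothetical, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a Win32 or MFC function.
// Encodes a UTF-16 string as UTF-8. Unpaired surrogates are
// replaced with U+FFFD (an assumption made for this sketch).
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate
        }

        if (cp <= 0x7F) {                      // 1 byte: ASCII, unchanged
            out += static_cast<char>(cp);
        } else if (cp <= 0x7FF) {              // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp <= 0xFFFF) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this sketch, `utf16_to_utf8(u"\u00FC")` yields the two bytes C3 BC, while pure ASCII input comes back unchanged, which is the "direct map" mentioned above.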

            Rupel
            wrote on last edited by
            #5

            yup thx. i just figured out how to use Wide...()

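            CStringW str("äö€èéß");
            // generous output buffer: UTF-8 never needs more than 3 bytes per UTF-16 code unit
            long size = str.GetLength()*6+1;
            char *out = new char[size];
            // -1 = input is null-terminated, so the terminator is converted too
            if (!WideCharToMultiByte(CP_UTF8,0,str,-1,out,size,NULL,NULL))
            {
                // a return value of 0 means the conversion failed
                DWORD err = GetLastError();
            }
            delete [] out;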

            *6 is a bit exaggerated (3 bytes per UTF-16 code unit would already be enough), but you never know - memory is cheap nowadays ;) i'll use that MultiByteToWideChar function too - thx for your reply. things look way smoother now ;) :wq
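
On the buffer size: instead of guessing a multiplier, the exact number of UTF-8 bytes can be counted up front. On Windows, WideCharToMultiByte returns the required size when it is called with an output size of 0. The portable sketch below shows the same idea by hand; `utf8_bytes_needed` is a hypothetical name, and counting an unpaired surrogate as 3 bytes is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: exact number of UTF-8 bytes needed for a
// UTF-16 string, not counting a terminating NUL.
std::size_t utf8_bytes_needed(const std::u16string& in)
{
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char16_t u = in[i];
        if (u <= 0x7F) {
            bytes += 1;                        // ASCII
        } else if (u <= 0x7FF) {
            bytes += 2;                        // e.g. U+00E4, U+00DF
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < in.size()
                   && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            bytes += 4;                        // surrogate pair: 2 units, 4 bytes
            ++i;
        } else {
            bytes += 3;                        // rest of the BMP, e.g. U+20AC
        }
    }
    return bytes;
}
```

For the six-character test string from the post this gives 13 bytes, comfortably below both the 3-per-unit bound and the *6 allocation.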
