Converting a wstring to UTF8?
-
OK, following on from the char -> wchar_t question earlier, does anyone know of a neat STL-friendly way to perform UTF-8 encoding/decoding? Currently I use the MS character encoding macros, e.g.:
// Convert UTF8 string to Unicode
wstring str = CA2W(utf8_string, CP_UTF8).m_psz;
But I'd like to use something that is a bit more platform independent! :)
The Rob Blog
Google Talk: robert.caldecott
-
AFAIK STL doesn't have UTF-8 support because the elements in a basic_string all have to be the same size, and the UTF-8 encodings of characters have varying sizes.
--Mike-- Visual C++ MVP
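To make the "varying sizes" point concrete, here is a minimal sketch of encoding a single Unicode code point as UTF-8. The name append_utf8 is hypothetical (it is not from the thread or from any library), and the sketch assumes the input is a code point value rather than a wchar_t unit. Depending on its value, one code point takes between 1 and 4 bytes:

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration, not a library API):
// append the UTF-8 encoding of one Unicode code point to a std::string.
void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        throw std::invalid_argument("surrogates are not valid code points");

    if (cp < 0x80) {                       // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        throw std::invalid_argument("code point above U+10FFFF");
    }
}
```

Appending U+0041, U+00E9, U+20AC and U+1F600 in turn grows the string by 1, 2, 3 and 4 bytes respectively, which is why a fixed-size element type cannot hold one UTF-8 character per element.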
-
Never mind. You can store a utf-8 string as a sequence of bytes in std::string; you just need to know which functions operate correctly on such a sequence and which do not. As for the conversion, I have been writing an article on platform-independent STL friendly utf-8 string operations for months, but I just can't make myself finish it :^)
My programming blahblahblah blog. If you ever find anything useful here, please let me know to remove it.
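As a rough illustration of the "sequence of bytes in std::string" approach, here is a sketch of a portable UTF-8 to wstring conversion that uses only the standard library. The function name utf8_to_wstring is hypothetical, and this is not the code from the article mentioned above. It assumes wchar_t holds either UTF-16 units (2 bytes, as on Windows) or whole code points (4 bytes, as on most Unix systems), and it throws on malformed input rather than substituting a replacement character:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration): decode UTF-8 bytes
// held in a std::string into a std::wstring.
std::wstring utf8_to_wstring(const std::string& in)
{
    // Smallest code point that legitimately needs a sequence of each length.
    static const std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp;
        std::size_t len;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (len > in.size() - i)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Fold in the continuation bytes (each must look like 10xxxxxx).
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, surrogates, and values past U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");
        i += len;

        if (sizeof(wchar_t) == 2 && cp >= 0x10000) {
            // 16-bit wchar_t: split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}
```

Note that size() on the std::string counts bytes, not characters: the six-byte input "A\xC3\xA9\xE2\x82\xAC" decodes to a three-element wstring. Byte-oriented operations such as copying, concatenation and comparison for equality work unchanged on UTF-8 data, while anything that indexes or truncates by position can cut a multi-byte sequence in half.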
-
In the Microsoft STL libraries, localization uses the C interface: the locale class calls the setlocale function, which cannot work with UTF-8. From the Remarks section of the setlocale function reference: "The set of available languages, country/region codes, and code pages includes all those supported by the Win32 NLS API (except code pages that require more than two bytes per character, like UTF-8)."
-
Nemanja Trifunovic wrote:
As for the conversion, I have been writing an article on platform-independent STL friendly utf-8 string operations for months, but I just can't make myself finish it
Man, that would be sweet... :) :)
The Rob Blog
Google Talk: robert.caldecott