CString to UTF8 conversion
-
I am a C# developer who recently had to work on a VC++ 5.0 project, and I am stuck on the following: I need to save a string (CString) to a file, and the file must be UTF-8 encoded.
void CPrintMakerDlg::SaveHTML(LPCSTR fileName, CString fileContent)
{
    ofstream iofile;
    try
    {
        LPCSTR str = fileContent.GetBuffer(fileContent.GetLength());
        iofile.open(fileName, ios::in | ios::trunc | ios::binary);
        iofile.write(str, fileContent.GetLength());
    }
    catch(...)
    {
        AfxMessageBox("The problem is on saving file");
    }
    iofile.close();
}
I heard that CString uses multi-byte characters internally, so I expected the file to come out as UTF-8, but it is being saved as ASCII.
Thanks,
Vythees
-
vytheeswaran wrote:
I heard that CString uses multi-byte characters internally, so I expected the file to come out as UTF-8
MBCS and UTF-8 are two different beasts: an MBCS string is encoded in the system code page, not in UTF-8. If your machine is using a code page like Latin-1, I don't believe the text will contain any lead bytes that start a double-byte sequence anyway. The other western encodings might have them, but I've never seen one in action in the US.
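To make the difference concrete, here is a minimal sketch in portable C++, assuming the input bytes are plain Latin-1 (ISO-8859-1), where every byte maps directly to the Unicode code point of the same value. The helper name Latin1ToUtf8 is hypothetical and not part of MFC. It does not handle the Windows-1252 characters in the 0x80-0x9F range or any double-byte code page. On Windows the usual route is to go through UTF-16 with MultiByteToWideChar (CP_ACP) and then WideCharToMultiByte (CP_UTF8), provided the target Windows version supports CP_UTF8.

```cpp
#include <string>

// Hypothetical helper (not an MFC or Win32 function).
// Assumption: every input byte is a Latin-1 (ISO-8859-1) character,
// so the byte value equals the Unicode code point (U+0000..U+00FF).
std::string Latin1ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two bytes

    for (std::string::size_type i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            // ASCII range: identical in Latin-1 and UTF-8.
            out += static_cast<char>(c);
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

With this sketch, the single Latin-1 byte 0xE9 (é) becomes the two UTF-8 bytes 0xC3 0xA9, while pure ASCII text passes through unchanged. That is why a file containing only ASCII characters looks the same in either encoding.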
vytheeswaran wrote:
but it's saving as ASCII
Are you sure it's limiting output to ASCII? It's more likely what some refer to as ANSI. Try outputting a character above index 127. Does the output resemble what MSDN lists for ANSI?
Character sets, Transformation formats, UTF
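One way to run that check is to dump the bytes that were actually written and look at anything above 0x7F. The following is only a sketch in portable C++; the helper name HexDump is hypothetical, and it assumes the file contents have already been read into a std::string.

```cpp
#include <string>

// Hypothetical helper: renders each byte as two uppercase hex digits,
// separated by spaces, so non-ASCII bytes are easy to spot.
std::string HexDump(const std::string& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;

    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(bytes[i]);
        if (i != 0) {
            out += ' ';
        }
        out += digits[c >> 4];    // high nibble
        out += digits[c & 0x0F];  // low nibble
    }
    return out;
}
```

If an é in the saved file shows up as the single byte E9, the file is in a single-byte ANSI code page such as Windows-1252. If it shows up as C3 A9, it is UTF-8.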