Unicode to Hex conversion
-
Hi everyone. I need to convert Unicode text to hex values. For example:
символ = %F1%E8%EC%E2%EE%EB
I know how to convert ASCII to hex:
CString AsciiToHex(CString ascii)
CString hex=L"";
for (int i=0;i
That doesn't work with Unicode.
So how can I convert Unicode text to hex values? Any help would be greatly appreciated.
-
Convert the Unicode text into a byte stream, then convert that byte stream into hex.
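The second half of that suggestion (byte stream to hex) can be sketched in portable C++. This is only an illustration: the function name bytesToPercentHex is made up here, and it assumes the text has already been converted to bytes in whatever encoding the server expects (that first step is the part the thread is really about).

```cpp
#include <cstdio>
#include <string>

// Percent-encode every byte of an already-encoded byte string.
// The caller chooses the encoding (windows-1251, UTF-8, ...);
// this hypothetical helper only does the "bytes to hex" step.
std::string bytesToPercentHex(const std::string& bytes)
{
    std::string out;
    char buf[4];  // '%' + two hex digits + terminating NUL
    for (unsigned char b : bytes) {
        // %02X: always two uppercase hex digits, e.g. 0x0A becomes "0A"
        std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(b));
        out += buf;
    }
    return out;
}
```

For instance, feeding it the six bytes F1 E8 EC E2 EE EB produces the string from the original question, so the remaining problem is producing those bytes from the Unicode text.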
-
Thank you, Manish Rastogi. By "byte stream", did you mean a char array? I tried converting my string to a char array and then to hex, but I'm still getting wrong results :) Can you give me an example, please?
msn92 wrote:
I tried to convert my string to char array and then convert to hex. Still getting wrong results :) Can you give me an example, please?
Could you please post your code? :)
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler. -- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong. -- Iain Clarke
[My articles] -
Of course!
CString UnicodeToHex(CString unicode){
    char* unicode_((char*)unicode.GetString());
    CString hex=L"";
    for (int i=0;i<unicode.GetLength();i++)
    {
        hex.Append(L"%");
        hex.AppendFormat(L"%x",unicode_[i]);
    }
    return hex;
}
msn92 wrote:
char* unicode_((char*)unicode.GetString());
unicode_ is serving no purpose in this code snippet. With input of %F1%E8%EC%E2%EE%EB, what are you expecting the output to be?
"Old age is like a bank account. You withdraw later in life what you have deposited along the way." - Unknown
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
-
msn92 wrote:
символ = %F1%E8%EC%E2%EE%EB
The above statement is wrong. The hexadecimal representation of the Unicode characters символ is { 0x0441, 0x0438, 0x043c, 0x0432, 0x043e, 0x043b }.
msn92 wrote:
for (int i=0;i<unicode.GetLength();i++)
In the above statement, the index i should run for (unicode.GetLength() * 2) iterations. :)
-
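To make the two points above concrete, here is a small portable sketch (not MFC; the function names are invented for illustration). It assumes the text is held as UTF-16, as a CString is in a Unicode build on Windows, and shows both the code units and the raw little-endian bytes behind them, two per character.

```cpp
#include <cstdio>
#include <string>

// List each UTF-16 code unit as 0xNNNN, separated by ", ".
std::string codeUnitsToHex(const std::u16string& text)
{
    std::string out;
    char buf[8];  // "0x" + four hex digits + NUL
    for (char16_t unit : text) {
        if (!out.empty()) out += ", ";
        std::snprintf(buf, sizeof buf, "0x%04X", static_cast<unsigned>(unit));
        out += buf;
    }
    return out;
}

// Percent-encode the raw bytes of the UTF-16 text, low byte first
// (little-endian, the byte order used on Windows). Each code unit
// contributes two bytes, which is why a byte-wise loop needs
// length * 2 iterations.
std::string utf16leBytesToPercentHex(const std::u16string& text)
{
    std::string out;
    char buf[8];  // "%XX%XX" + NUL
    for (char16_t unit : text) {
        unsigned lo = unit & 0xFF;
        unsigned hi = (unit >> 8) & 0xFF;
        std::snprintf(buf, sizeof buf, "%%%02X%%%02X", lo, hi);
        out += buf;
    }
    return out;
}
```

For символ, written as u"\u0441\u0438\u043C\u0432\u043E\u043B", the first function gives 0x0441, 0x0438, 0x043C, 0x0432, 0x043E, 0x043B and the second gives %41%04%38%04%3C%04%32%04%3E%04%3B%04. Neither matches %F1%E8%EC%E2%EE%EB, which suggests the Firefox output is in some other encoding rather than raw UTF-16.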
CPallini, thank you for your reply. I have a web form that I need to submit programmatically, and the data should be in Unicode. When I submit that web form using Firefox with the text символ, this is what Firefox sends to the server: %F1%E8%EC%E2%EE%EB
I'm trying to send the same thing programmatically. So if
символ = %F1%E8%EC%E2%EE%EB
is wrong, why is Firefox sending it like that? Is it because of charset=WINDOWS-1251 in the source code of the web page? What should I do to convert to the same hex format Firefox is using? Please let me know if I'm not clear enough.
-
OK, I figured it out. It is because of the charset (windows-1251). To convert the text to the hex format that Firefox uses (windows-1251 charset), I had to map each Russian letter by hand to its hex equivalent in the windows-1251 charset. I know it's not the best way, but I couldn't find a better one. In case someone needs it:
/*----------------------------------*/
//***Windows-1251 charset to Hex***
//If it works, it was written by
//Siroj Matchanov (aka tech, msn92);
//if it doesn't, I dunno who wrote it.
//Resources used:
//http://www.science.co.il/language/Character-Code.asp?s=1251
/*----------------------------------*/
void Win1251ToHex(
    CString &input, /*Input text (Russian letters and ASCII)*/
    CString &hex    /*Output (windows-1251 bytes in %XX hex)*/
)
{
    hex=L"";
    for (int i=0;i<input.GetLength();i++)
    {
        //each char represents a unique number (e.g. int(L'Ў')=1038)
        int ii=input[i];
        //Russian letters and the special symbols below are all greater than 1024
        if(ii>1024){
            //---------------------------------//
            //   Uppercase&Lowercase Letters   //
            // А-Я-а-я (1040-1071-1072-1103)   //
            //---------------------------------//
            if((ii>1039)&&(ii<1104)){
                BYTE x=0xC0;
                x=x+(ii-1040);
                hex.Append(L"%");
                //two uppercase hex digits, the form Firefox sends
                hex.AppendFormat(L"%02X",x);
            }else{
                //------------------------//
                //    Special symbols     //
                //------------------------//
                switch (ii)
                {
                case 8218: hex.Append(L"%82"); break; //‚
                case 8222: hex.Append(L"%84"); break; //„
                case 8230: hex.Append(L"%85"); break; //…
                case 1038: hex.Append(L"%A1"); break; //Ў
                case 1118: hex.Append(L"%A2"); break; //ў
                case 1025: hex.Append(L"%A8"); break; //Ё
                case 1105: hex.Append(L"%B8"); break; //ё
                case 8470: hex.Append(L"%B9"); break; //№
                }
            }
            //Because they are not usually used,
            //this code doesn't convert the following symbols:
            //ЂЃ‚ѓ„…†‡€‰Љ‹ЊЌЋЏђ‘’“”•–—™љ›њќћџЈ¤Ґ¦§©«¬®Ї°±Ііґµ¶·є»јЅѕї
            //If you want them in windows-1251 hex format too,
            //then please continue the code yourself :)
        }else if(ii<127){
            //ASCII characters have the same byte value in windows-1251.
            //(The original used ascii[i], which is not declared here;
            //%02X keeps values below 0x10 at two digits.)
            hex.Append(L"%");
            hex.AppendFormat(L"%02X",ii);
        }
    }
}
modified on Sunday, August 30, 2009 2:09 AM
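For anyone who wants to try the same mapping outside MFC, here is a rough portable sketch of the approach above. The names toWin1251 and win1251PercentHex are invented for this example, it assumes UTF-16 input, and it covers only the characters handled in the post (ASCII, А-я and the eight special symbols); anything else is skipped. On Windows, the Win32 function WideCharToMultiByte called with code page 1251 is probably the more complete route for the first step, but that is not what this sketch does.

```cpp
#include <cstdio>
#include <string>

// Map one UTF-16 code unit to its windows-1251 byte.
// Returns false for characters this sketch does not cover.
bool toWin1251(char16_t c, unsigned char& out)
{
    if (c < 0x80) {                       // ASCII keeps its value
        out = static_cast<unsigned char>(c);
        return true;
    }
    if (c >= 0x0410 && c <= 0x044F) {     // А..я map to 0xC0..0xFF
        out = static_cast<unsigned char>(0xC0 + (c - 0x0410));
        return true;
    }
    switch (c) {                          // special symbols from the post
        case 0x201A: out = 0x82; return true;  // ‚
        case 0x201E: out = 0x84; return true;  // „
        case 0x2026: out = 0x85; return true;  // …
        case 0x040E: out = 0xA1; return true;  // Ў
        case 0x045E: out = 0xA2; return true;  // ў
        case 0x0401: out = 0xA8; return true;  // Ё
        case 0x0451: out = 0xB8; return true;  // ё
        case 0x2116: out = 0xB9; return true;  // №
    }
    return false;
}

// Percent-encode the text as windows-1251 bytes. Like the function
// in the post, this encodes every byte, ASCII letters included.
std::string win1251PercentHex(const std::u16string& text)
{
    std::string out;
    char buf[4];  // '%' + two hex digits + NUL
    for (char16_t c : text) {
        unsigned char b;
        if (!toWin1251(c, b)) continue;   // unsupported character: skipped
        std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(b));
        out += buf;
    }
    return out;
}
```

Calling win1251PercentHex(u"\u0441\u0438\u043C\u0432\u043E\u043B") (that is, символ) gives %F1%E8%EC%E2%EE%EB, the same string Firefox sent. Note that browsers normally leave unreserved ASCII characters such as letters and digits unescaped, so the output for mixed text will not match a browser's byte for byte.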