COleVariant, Unicode and truncated strings
-
I'm working on an app that needs to work with both ASCII and Unicode, but I've run into a rather frustrating problem. Whenever I pass a Unicode string to COleVariant::SetString(), the resulting bstrVal contains the string only up to the first null character (regardless of the VARTYPE parameter), so most of my Unicode strings are truncated prematurely. If I remember my Unicode correctly, a double null is required to mark the end of a Unicode string, is it not? Does anybody know why this happens and how I can fix it?
When I die I'd like to go peacefully in my sleep like my father, not screaming in terror like his passengers!!!
-
While a BSTR can technically contain any data at all, it's most often used to hold a C-style (zero-terminated) UTF-16 string. Many BSTR wrappers make the same assumption. Step into the COleVariant code and see if it is doing this.
--Mike-- Visual C++ MVP
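To make the distinction above concrete, here is a rough sketch in portable C++. It is not the MFC code: std::u16string stands in for a BSTR, and the helper names copy_terminated and copy_counted are made up for illustration. On Windows the corresponding pair would be SysAllocString, which copies up to the first zero, and SysAllocStringLen, which takes an explicit count and so keeps embedded zeros.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copies up to the first zero code unit, the way a
// wrapper that assumes a C-style string would.
std::u16string copy_terminated(const char16_t* text) {
    assert(text != nullptr);
    return std::u16string(text);  // length is found by scanning for u'\0'
}

// Hypothetical helper: copies exactly `count` code units, embedded zeros
// included, the way a length-counted allocation would.
std::u16string copy_counted(const char16_t* text, std::size_t count) {
    assert(text != nullptr);
    return std::u16string(text, count);
}

// Sample data: "abc", an embedded zero, then "def"
// (7 code units plus the terminator the compiler appends).
const char16_t kSample[] = u"abc\0def";
const std::size_t kSampleLength = sizeof(kSample) / sizeof(kSample[0]) - 1;
```

With this sample, copy_terminated(kSample) keeps only "abc", while copy_counted(kSample, kSampleLength) keeps all seven code units. If the data really has to carry embedded zeros, the length must travel with the pointer.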
-
If there is no other solution, you can use an alternative approach based on STL strings: std::wstring for Unicode or std::string for ANSI. These strings allow null characters within the string. For instance:
    std::wstring s = L"abc";
    s += L'\0';  // put null character after 'c'
    s += L"def";
    size_t length = s.length();  // now length is 7
    wchar_t c;
    c = s[2];  // now c is 'c'
    c = s[3];  // now c is 0
    c = s[4];  // now c is 'd'
    const wchar_t * p = s.c_str();  // now p points to an array of 8 characters, including the final null character
Next, if you need this as an OLE variant, I think you can use "safe arrays". For instance, this fragment copies the above string, including its final null character, to a safe array:
    CComSafeArray< USHORT > sa;
    sa.Add(static_cast<ULONG>(s.length() + 1), reinterpret_cast<const USHORT*>(p));
Hope it helps. -- modified at 3:58 Friday 16th June, 2006
-
Thanks for the suggestions. I managed to fix (hack) it. I was already using safe arrays, as the task I'm performing must communicate with older COM and OLE objects. The problem seemed to be an error in one of the OLE .cpp files in the MSVS directory: it appeared to call the standard strlen function regardless of the input type, so I changed the code to check for Unicode and call wcslen when needed.
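For anyone who finds this thread later, the strlen versus wcslen difference is probably also where the "double null" impression comes from. A UTF-16 string ends with a single 16-bit zero, which shows up as two zero bytes, while every ASCII character in it already carries one zero byte. The sketch below is only an illustration, not the patched MFC source: the byte array is hand-written UTF-16LE, and utf16_length is a made-up stand-in for what wcslen does on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// "abc" encoded by hand as UTF-16LE bytes: each ASCII character is followed
// by a zero byte, and the terminator is one 16-bit zero (two zero bytes).
const char kUtf16LeBytes[] = {'a', 0, 'b', 0, 'c', 0, 0, 0};

// Byte-oriented length: stops at the first zero byte.
std::size_t byte_length(const char* bytes) {
    assert(bytes != nullptr);
    return std::strlen(bytes);
}

// Hypothetical stand-in for wcslen on Windows: counts 16-bit units up to the
// first unit whose two bytes are both zero.
std::size_t utf16_length(const char* bytes) {
    assert(bytes != nullptr);
    std::size_t units = 0;
    while (bytes[2 * units] != 0 || bytes[2 * units + 1] != 0) {
        ++units;
    }
    return units;
}
```

Here byte_length(kUtf16LeBytes) stops after the 'a', while utf16_length(kUtf16LeBytes) counts all three characters, which matches the truncation described in the original question.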