COM interop and BSTR
-
I have a COM object that returns a binary string in a BSTR. .NET marshals this to a string. The problem is that some of the bytes get dropped when this happens. As an example, load a PFD file into a byte array. Convert the byte array to a string, and then convert the string back to a byte array (UTF-8). If you compare the two byte arrays, they aren't the same. I probably haven't explained this very well, but I hope I haven't made it too confusing. Has anyone had any experience with this, and what can I do to make it work? Thanks.
They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety. --Benjamin Franklin, 1759
-
Do you mean a PDF file? If so, what does loading the file have to do with COM? I have written quite a bit of code for moving and loading PDFs, and I do it with Streams and byte arrays and have yet to see a problem. Perhaps if you explain in more detail, I may be able to help.
-
The COM object creates the PDF file on the fly. It's a third-party control, so there isn't too much we can do about it. I think my example of loading and manipulating a PDF file isn't the best example, because it doesn't really simulate the problem. However, just out of curiosity, is there anything wrong with the following code:
// Assume that pdfFile is a FileStream object that has opened a PDF file,
// length is an int initialized to the size of the file, and that pdfBytes
// is a byte array of size length.
pdfFile.Read(pdfBytes, 0, length);
// Convert the pdfBytes array to a string
string pdfString = Encoding.UTF8.GetString(pdfBytes);
// Convert the string to a new byte array
byte[] pdfBytes2 = Encoding.Unicode.GetBytes(pdfString);
At this point, pdfBytes and pdfBytes2 should be the same, correct? This is not the behavior I'm seeing. I'm new to C#, and probably have made a very obvious mistake, but I would really appreciate someone pointing it out to me.
-
Hmmm, not sure why the behaviour would be like that. I never have to convert the byte array into a string. I just carry it around until I have to write it out. I am not sure what happens behind the scenes when the byte array is converted to a string. Maybe some bytes do not convert. What if you create a byte array with simple text or numbers and try the same thing? If that works, then maybe it's something in the PDF that does not convert well to a string. Just guessing. Sorry I could not be of more help.
-
Encoding.Unicode is UTF-16, which is a different encoding from UTF-8, so decoding with one and encoding with the other will not give you back the same bytes. :~
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
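To make the point above concrete, here is a minimal sketch of why the round trip in the posted code cannot return the original bytes. The class and method names (RoundTripDemo, ViaUtf8, Utf8ThenUtf16) are made up for illustration, and the sample bytes are assumed to resemble the binary marker often found near the start of a PDF. Bytes that are not valid UTF-8 are replaced with U+FFFD when decoded, so they are lost even if the same encoding is used in both directions.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Decode bytes as UTF-8 text, then encode the text back to UTF-8.
    public static byte[] ViaUtf8(byte[] data) =>
        Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(data));

    // Decode as UTF-8 but re-encode as UTF-16, the mix used in the posted code.
    public static byte[] Utf8ThenUtf16(byte[] data) =>
        Encoding.Unicode.GetBytes(Encoding.UTF8.GetString(data));

    public static void Main()
    {
        // Plain ASCII text is valid UTF-8, so it survives a UTF-8 round trip.
        byte[] ascii = Encoding.ASCII.GetBytes("%PDF-1.4");

        // Assumed sample: "%PDF" followed by high bytes that are not valid UTF-8.
        byte[] binary = { 0x25, 0x50, 0x44, 0x46, 0xE2, 0xE3, 0xCF, 0xD3 };

        Console.WriteLine(ascii.SequenceEqual(ViaUtf8(ascii)));    // ASCII round trip
        Console.WriteLine(binary.SequenceEqual(ViaUtf8(binary)));  // binary round trip
        Console.WriteLine(Utf8ThenUtf16(ascii).Length);            // two bytes per char
    }
}
```

The first comparison should succeed and the second should fail. The last line shows that even pure ASCII comes back twice as long when UTF-8 is mixed with UTF-16.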
-
The interop part of marshaling the BSTR to a string was working OK. The problem was with the conversion from a Unicode array to a byte array. The correct conversion is:
byte[] pdfBytes = Encoding.Convert(Encoding.Unicode, Encoding.Default, Encoding.Unicode.GetBytes(pdfString));
I had been trying to convert to UTF-8, which was what was causing the problems. Converting to Default solved the problem. Thanks for all the comments. Dan
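For anyone reusing the fix above, here is a hedged sketch of the same conversion shape with a fixed single-byte encoding. Encoding.Default depends on the platform: it is the system ANSI code page on .NET Framework but UTF-8 on .NET Core and later, so ISO-8859-1 (Latin-1) is used here as a stand-in. That it matches how any particular COM control packs its bytes is an assumption. The names SingleByteDemo, BytesToString and StringToBytes are illustrative only.

```csharp
using System;
using System.Linq;
using System.Text;

public static class SingleByteDemo
{
    // ISO-8859-1 maps every byte 0x00-0xFF to the character with the same code point.
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Widen each byte into one char, as a one-byte-per-character string would hold it.
    public static string BytesToString(byte[] data) => Latin1.GetString(data);

    // Same shape as the conversion in the post, with Latin-1 in place of Encoding.Default.
    public static byte[] StringToBytes(string text) =>
        Encoding.Convert(Encoding.Unicode, Latin1, Encoding.Unicode.GetBytes(text));

    public static void Main()
    {
        // Every possible byte value, to check that nothing is dropped or replaced.
        byte[] allBytes = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();
        byte[] roundTripped = StringToBytes(BytesToString(allBytes));
        Console.WriteLine(allBytes.SequenceEqual(roundTripped));
    }
}
```

Note that Encoding.Convert(Encoding.Unicode, target, Encoding.Unicode.GetBytes(text)) produces the same bytes as target.GetBytes(text), so the shorter call would work too.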
-
Thanks for posting the solution. I'm sure I will need it one day. :-)