UTF-8 Encoding with File.CreateText
-
When you use File.CreateText and then pass a normal string to the resulting StreamWriter's Write method, does it automatically convert the text to UTF-8, or do I need to encode it first? Thanks.
-Matt
------------------------------------------
The 3 great virtues of a programmer: Laziness, Impatience, and Hubris. --Larry Wall
-
Strings in .NET are an abstract concept: they do not have any encoding but are just Unicode (their internal memory representation is UTF-16, but you'll never notice unless you use unsafe pointers). Encoding always happens when converting from String to bytes, so yes, the StreamWriter does this for you.
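To make that concrete, here is a minimal sketch (the class name, method name, and temp file name are made up for illustration) that writes a string through File.CreateText and compares the bytes on disk with what a UTF-8 encoder without a byte order mark produces by hand. It assumes the temp directory is writable.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Writes text through File.CreateText and returns the raw bytes that ended up on disk.
    public static byte[] WriteAndReadBack(string path, string text)
    {
        using (StreamWriter writer = File.CreateText(path))
        {
            writer.Write(text); // the StreamWriter encodes the string here
        }
        return File.ReadAllBytes(path);
    }

    static void Main()
    {
        string text = "caf\u00e9"; // 'é' (U+00E9) is outside ASCII
        // Hypothetical file name, chosen only for this demo.
        string path = Path.Combine(Path.GetTempPath(), "createtext-demo.txt");

        byte[] onDisk = WriteAndReadBack(path, text);
        byte[] manual = new UTF8Encoding(false).GetBytes(text); // UTF-8, no byte order mark

        Console.WriteLine(BitConverter.ToString(onDisk)); // 63-61-66-C3-A9
        Console.WriteLine(BitConverter.ToString(manual)); // 63-61-66-C3-A9
    }
}
```

The 'é' comes out as the two bytes C3 A9, so the conversion to UTF-8 happened inside the writer without any explicit encoding step.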
-
That's what I thought, but for some reason when I save out my file, it does not seem to be UTF-8 encoded. I know this because when you open a text file in Notepad and then select File | Save As..., the "Encoding" combo box of the Save As dialog shows the encoding of the current document. In this case it came up as ASCII. To test this, I saved the file under the same name using the UTF-8 encoding in Notepad, closed it, and loaded it back into Notepad again, once more selecting "Save As..." to see what the Encoding drop-down defaults to. It came up as UTF-8 the second time. Is it really writing it out as UTF-8? Any ideas? Thanks. -Matt
-
It is writing a UTF-8 stream without any header (byte order mark). Unless you have multi-byte characters, you can't tell the difference between a UTF-8 document and a normal ASCII document. To include the preamble header, create the writer yourself and pass the encoding:
new StreamWriter(fileName, false, System.Text.Encoding.UTF8);
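A rough sketch of the difference, assuming a writable temp directory (the class, method, and file names are invented for the example). It writes the same ASCII-only text once through File.CreateText and once through the three-argument StreamWriter constructor (path, append, encoding) with Encoding.UTF8, then dumps the bytes:

```csharp
using System;
using System.IO;
using System.Text;

static class BomDemo
{
    // File.CreateText: UTF-8 without the preamble.
    public static byte[] WriteWithoutBom(string path, string text)
    {
        using (StreamWriter writer = File.CreateText(path))
        {
            writer.Write(text);
        }
        return File.ReadAllBytes(path);
    }

    // Explicit Encoding.UTF8: the writer emits the preamble EF BB BF first.
    public static byte[] WriteWithBom(string path, string text)
    {
        using (StreamWriter writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.Write(text);
        }
        return File.ReadAllBytes(path);
    }

    static void Main()
    {
        // Hypothetical file names, chosen only for this demo.
        string plain = Path.Combine(Path.GetTempPath(), "bom-demo-plain.txt");
        string marked = Path.Combine(Path.GetTempPath(), "bom-demo-marked.txt");

        Console.WriteLine(BitConverter.ToString(WriteWithoutBom(plain, "hello")));
        // 68-65-6C-6C-6F
        Console.WriteLine(BitConverter.ToString(WriteWithBom(marked, "hello")));
        // EF-BB-BF-68-65-6C-6C-6F
    }
}
```

The first file is byte-for-byte identical to plain ASCII, which is why an editor has nothing to go on; the three extra bytes at the start of the second file are the marker an editor can use to recognize UTF-8.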
Anyone who thinks he has a better idea of what's good for people than people do is a swine. - P.J. O'Rourke
-
Suuuhweeet! Thanks. That's what I was looking for. -Matt