XML encoding issue

  • PIEBALDconsult wrote:

    George_George wrote:

    I am using StringWriter

    That doesn't write to a file, does it? Always use an XmlTextWriter for writing XML documents to files.

    George_George wrote (#7):

    Thanks PIEBALDconsult. I only need an in-memory representation (a string) of the XML; there is no need to write it to a file. My question is: why, even though I set the encoding to UTF-8, is a UTF-16 header displayed by the code in my original question? regards, George

      PIEBALDconsult wrote (#8):

      I'm guessing that it's because .NET strings are two-byte Unicode (UTF-16), but I could easily be wrong.

        George_George wrote (#9):

        Thanks PIEBALDconsult. I agree that C# uses UTF-16 as its internal string encoding, but why is the UTF-8 XML header that I had already set overwritten with UTF-16? regards, George

          PIEBALDconsult wrote (#10):

          Because doing otherwise would be wrong. What problem are you trying to solve?

            George_George wrote (#11):

            Thanks PIEBALDconsult. I do not quite understand why I set a UTF-8 header, yet UTF-16 is output in my original sample. What internal operation steals and changes my original header? :-) regards, George

              PIEBALDconsult wrote (#12):

              XmlDocument.Save and XmlTextWriter will only write well-formed XML. The writer knows that the StringWriter uses UTF-16, so it puts the matching encoding in the XML declaration. Encoding the text as UTF-16 but declaring it as UTF-8 would yield malformed XML. If you want UTF-8, write it to a file; a StringBuilder won't do it.
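
The behavior described above can be reproduced with a minimal sketch. It is not the original poster's code (which is not shown in this part of the thread); the class name `DeclaredEncodingDemo` and the helper `WriteViaStringWriter` are made up for illustration. It assumes only that `StringWriter.Encoding` reports UTF-16 and that `XmlTextWriter` takes the declared encoding from the `TextWriter` it is given.

```csharp
using System;
using System.IO;
using System.Xml;

static class DeclaredEncodingDemo
{
    // Hypothetical helper: writes a tiny document through a StringWriter
    // and returns the resulting text, XML declaration included.
    public static string WriteViaStringWriter()
    {
        StringWriter target = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(target);

        writer.WriteStartDocument();               // emits the <?xml ...?> declaration
        writer.WriteElementString("Price", "456"); // root element
        writer.WriteEndDocument();
        writer.Flush();

        return target.ToString();
    }

    public static void Main()
    {
        // StringWriter reports UTF-16 as its encoding; the XmlTextWriter
        // copies that name into the declaration it writes.
        Console.WriteLine(new StringWriter().Encoding.WebName);
        Console.WriteLine(WriteViaStringWriter());
    }
}
```

The declaration in the returned string names utf-16 because that is what the target writer reports, regardless of what the caller would like it to say.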

              • George_George wrote:

                Thanks wmba, 1. I have solved this issue with your help. Here is my code. Could you please review whether it is correct?

                using System;
                using System.Text;
                using System.IO;
                using System.Xml;

                class FSOpenWrite
                {
                    public static void Main()
                    {
                        StringWriter stream = new StringWriter();
                        XmlTextWriter writer = new XmlTextWriter(stream);
                        writer.WriteStartElement("Stock");
                        writer.WriteAttributeString("Symbol", "123");
                        writer.WriteElementString("Price", "456");
                        writer.WriteElementString("Change", "abc");
                        writer.WriteElementString("Volume", "edd");
                        writer.WriteEndElement();

                        string content = stream.ToString();

                        return;
                    }
                }

                2. Why, in the original code from my question, can I only use UTF-8 encoding even though I set UTF-16? regards, George

                mlsteeves wrote (#13):

                Your code sample is missing the part that tells the XmlTextWriter which encoding to use. If you use any class derived from TextWriter (such as StringWriter), you can't specify the encoding: the underlying string in a StringWriter is UTF-16, so there is no option to use a different Encoding. If, however, you use a MemoryStream, or anything else derived directly from Stream, you can specify a different Encoding. Here is a code snippet that shows this:

                        // Needs: using System; using System.IO; using System.Text; using System.Xml;
                        MemoryStream ms = new MemoryStream();

                        // Set the encoding to UTF-8:
                        XmlTextWriter writer = new XmlTextWriter(ms, Encoding.UTF8);

                        // Just makes the XML easier to read:
                        writer.Formatting = Formatting.Indented;

                        // Write out our XML document:
                        writer.WriteStartDocument();
                        writer.WriteStartElement("Stock");
                        writer.WriteAttributeString("Symbol", "123");
                        writer.WriteElementString("Price", "456");
                        writer.WriteElementString("Change", "abc");
                        writer.WriteElementString("Volume", "edd");
                        writer.WriteEndElement();

                        // Flush the writer, then rewind the stream so we can read back from it:
                        writer.Flush();
                        ms.Seek(0, SeekOrigin.Begin);

                        // Read our memory stream and output it to the console:
                        StreamReader sr = new StreamReader(ms);
                        string content = sr.ReadToEnd();
                        Console.WriteLine(content);
                

                Note that you could have used a similar technique in your original code, where you used the XmlDocument. You were getting the UTF-16 encoding because the underlying target of your writer was a string: StringWriter writes directly to a string (backed by a StringBuilder), and because all strings in .NET are UTF-16, that is the encoding you got. When you write directly to a stream (FileStream, MemoryStream, etc.), you are not writing to a string; conceptually, you are writing to an array of bytes, and that is why you can specify a different encoding. Anyway, I hope this helps you out.
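
A minimal sketch of the same technique applied to an XmlDocument follows. The original code from the question is not shown in this part of the thread, so the document contents, the class name `XmlDocumentUtf8Demo`, and the helper `SaveAsUtf8` are assumptions made for illustration. Using `new UTF8Encoding(false)` instead of `Encoding.UTF8` is also an assumption; it simply keeps a byte order mark out of the output.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class XmlDocumentUtf8Demo
{
    // Hypothetical helper: serializes an XmlDocument to UTF-8 bytes and
    // decodes them back into a string for display.
    public static string SaveAsUtf8(XmlDocument doc)
    {
        MemoryStream ms = new MemoryStream();

        // Assumption: UTF-8 without a byte order mark.
        Encoding utf8NoBom = new UTF8Encoding(false);
        XmlTextWriter writer = new XmlTextWriter(ms, utf8NoBom);
        writer.Formatting = Formatting.Indented;

        // The declaration is written by the writer, using the writer's encoding.
        doc.Save(writer);
        writer.Flush();

        return utf8NoBom.GetString(ms.ToArray());
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<Stock Symbol=\"123\"><Price>456</Price></Stock>");
        Console.WriteLine(SaveAsUtf8(doc));
    }
}
```

Only the bytes in the MemoryStream are UTF-8. Once they are decoded, the resulting .NET string is UTF-16 in memory again, even though its declaration says utf-8, so it should be re-encoded as UTF-8 before being written anywhere.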

                  George_George wrote (#14, in reply to #12):

                  Can I change the encoding of the StringWriter from UTF-16 to UTF-8? regards, George

                    George_George wrote (#15, in reply to #13):

                    I like your sample, wmba! So cool!! :-) regards, George

                      PIEBALDconsult wrote (#16, in reply to #14):

                      NO, goddammit! You can't! .NET strings are UTF-16, and that's it, end of story!

                        George_George wrote (#17):

                        Thanks PIEBALDconsult, I have solved this issue by using a MemoryStream. :-) regards, George
