XML encoding issue

  • mlsteeves wrote:

    You are using StringWriter, which "Implements a TextWriter for writing information to a string" (http://msdn.microsoft.com/en-us/library/system.io.stringwriter.aspx).

    George_George wrote (#6):

    Thanks wmba. 1. I have solved this issue with your help. Here is my code; could you please review whether it is correct?

    using System;
    using System.Text;
    using System.IO;
    using System.Xml;

    class FSOpenWrite
    {
        public static void Main()
        {
            // The StringWriter collects the XML in memory as a .NET string.
            StringWriter stream = new StringWriter();
            XmlTextWriter writer = new XmlTextWriter(stream);

            writer.WriteStartElement("Stock");
            writer.WriteAttributeString("Symbol", "123");
            writer.WriteElementString("Price", "456");
            writer.WriteElementString("Change", "abc");
            writer.WriteElementString("Volume", "edd");
            writer.WriteEndElement();

            // Flush so that all pending output reaches the StringWriter.
            writer.Flush();

            string content = stream.ToString();
        }
    }

    2. In the code from my original question, why do I get UTF-16 even though I set the encoding to UTF-8? Regards, George

    • PIEBALDconsult wrote:

      George_George wrote:

      I am using StringWriter

      That doesn't write to a file, does it? Always use an XmlTextWriter for writing XML documents to files.

      George_George wrote (#7):

      Thanks PIEBALDconsult. I only need an in-memory representation (a string) of the XML; there is no need to write to a file. My question is: in the code from my original question, why is a UTF-16 header displayed even though I set the encoding to UTF-8? Regards, George

        PIEBALDconsult wrote (#8):

        I'm guessing that it's because .NET strings are UTF-16 (two-byte code units), but I could easily be wrong.
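
A quick way to check that guess is to compare how many bytes the same text needs in each encoding. This is only a minimal sketch; the class name `StringWidthDemo` and the helper `ByteCount` are made up for illustration and are not from the thread.

```csharp
using System;
using System.Text;

static class StringWidthDemo
{
    // Returns the number of bytes needed to store `text` in the given encoding.
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    public static void Main()
    {
        string text = "Stock";

        // A .NET char is one UTF-16 code unit, i.e. two bytes.
        Console.WriteLine(sizeof(char));                      // prints 2

        // Encoding.Unicode is UTF-16 (little endian): two bytes per ASCII letter.
        Console.WriteLine(ByteCount(text, Encoding.Unicode)); // prints 10

        // UTF-8 needs only one byte per ASCII letter.
        Console.WriteLine(ByteCount(text, Encoding.UTF8));    // prints 5
    }
}
```

So the in-memory string is UTF-16, and other encodings only come into play when the text is converted to bytes.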

          George_George wrote (#9):

          Thanks PIEBALDconsult. I agree that C# uses UTF-16 as its internal string encoding, but why is the UTF-8 XML header that I already set overwritten with UTF-16? Regards, George

            PIEBALDconsult wrote (#10):

            Because doing otherwise would be wrong. What problem are you trying to solve?

              George_George wrote (#11):

              Thanks PIEBALDconsult. I do not quite understand why UTF-16 is output in my original sample even though I set a UTF-8 header. What internal operation steals and changes my original header? :-) Regards, George

                PIEBALDconsult wrote (#12):

                XmlDocument.Save and XmlTextWriter will only write well-formed XML. The writer knows that the StringWriter uses UTF-16, so it declares that encoding. Encoding in UTF-16 while declaring UTF-8 would yield malformed XML. If you want UTF-8, write it to a file; a StringBuilder won't do it.
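
To see this behaviour in isolation, here is a minimal sketch that writes a small document through a StringWriter and prints the result. The class name `DeclaredEncodingDemo` and the method `WriteToString` are assumptions for illustration, and the element names are borrowed from George's sample rather than from his original code, which is not shown in this thread.

```csharp
using System;
using System.IO;
using System.Xml;

static class DeclaredEncodingDemo
{
    // Writes a tiny document through a StringWriter and returns the resulting text.
    public static string WriteToString()
    {
        StringWriter target = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(target);

        // WriteStartDocument emits the XML declaration; the encoding name in it
        // is taken from the TextWriter the XmlTextWriter was given.
        writer.WriteStartDocument();
        writer.WriteStartElement("Stock");
        writer.WriteAttributeString("Symbol", "123");
        writer.WriteEndElement();
        writer.WriteEndDocument();
        writer.Flush();

        return target.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteToString());
    }
}
```

The declaration in the printed text should read encoding="utf-16", because that is what StringWriter reports as its encoding.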

                  mlsteeves wrote (#13):

                  Your code sample is missing the part that tells the XmlTextWriter what encoding to use. If you pass it a class derived from TextWriter (like StringWriter), you can't specify the encoding on the XmlTextWriter; it takes the encoding from the TextWriter itself. The backing string of a StringWriter is UTF-16, so you have no option to use a different encoding there. If, however, you use a MemoryStream, or anything else derived from Stream, you can specify a different encoding. Here is a code snippet that shows this:

                          using System;
                          using System.IO;
                          using System.Text;
                          using System.Xml;

                          class Program
                          {
                              public static void Main()
                              {
                                  MemoryStream ms = new MemoryStream();

                                  // Set the encoding to UTF-8:
                                  XmlTextWriter writer = new XmlTextWriter(ms, Encoding.UTF8);

                                  // Just makes the XML easier to read:
                                  writer.Formatting = Formatting.Indented;

                                  // Write out our XML document:
                                  writer.WriteStartDocument();
                                  writer.WriteStartElement("Stock");
                                  writer.WriteAttributeString("Symbol", "123");
                                  writer.WriteElementString("Price", "456");
                                  writer.WriteElementString("Change", "abc");
                                  writer.WriteElementString("Volume", "edd");
                                  writer.WriteEndElement();

                                  // Flush the writer, then rewind the stream so we can read it back:
                                  writer.Flush();
                                  ms.Seek(0, SeekOrigin.Begin);

                                  // Read our memory stream, and output to the console:
                                  StreamReader sr = new StreamReader(ms);
                                  string content = sr.ReadToEnd();
                                  Console.WriteLine(content);
                              }
                          }
                  

                  Note that you could have used a similar technique in your original code, where you used the XmlDocument. You were getting the UTF-16 encoding because your underlying writer was a StringWriter. StringWriter writes to a StringBuilder, which produces a string, and because strings in .NET are all UTF-16, that is the encoding you got. When you write directly to a stream (FileStream, MemoryStream, etc.), you are not writing to a string; conceptually, you are writing to an array of bytes, and that is why you can specify a different encoding. Anyway, I hope this helps you out.
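
As a rough sketch of the XmlDocument variant mentioned above: the original code from the question is not shown in this thread, so the document content, the class name `XmlDocumentUtf8Demo`, and the helper `SaveAsUtf8` are all assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class XmlDocumentUtf8Demo
{
    // Saves the document through a UTF-8 XmlTextWriter into memory
    // and returns the bytes decoded back into a string.
    public static string SaveAsUtf8(XmlDocument doc)
    {
        MemoryStream ms = new MemoryStream();

        // UTF8Encoding(false) means UTF-8 without a byte order mark.
        XmlTextWriter writer = new XmlTextWriter(ms, new UTF8Encoding(false));

        doc.Save(writer);
        writer.Flush();

        return Encoding.UTF8.GetString(ms.ToArray());
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<Stock Symbol=\"123\"><Price>456</Price></Stock>");
        Console.WriteLine(SaveAsUtf8(doc));
    }
}
```

Because the writer sits on a byte stream with an explicit encoding, the declaration in the output should say utf-8, and the bytes in the MemoryStream really are UTF-8.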

                    George_George wrote (#14):

                    Can I set the encoding of StringWriter from UTF-16 to UTF-8? regards, George

                      George_George wrote (#15):

                      I like your sample, wmba! So cool!! :-) Regards, George

                        PIEBALDconsult wrote (#16):

                        NO, goddammit! You can't! .net strings are UTF-16, and that's it, end of story!
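
For completeness: the string itself stays UTF-16 whatever you do, but the encoding name that a StringWriter reports comes from its overridable Encoding property. The sketch below assumes a hypothetical subclass, `Utf8StringWriter`, that is not part of the framework; it changes only the name written into the XML declaration, not how the string is stored in memory.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding.
sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

static class Utf8LabelDemo
{
    // Writes a tiny document and returns it as an ordinary (UTF-16) .NET string.
    public static string WriteToString()
    {
        Utf8StringWriter target = new Utf8StringWriter();
        XmlTextWriter writer = new XmlTextWriter(target);

        writer.WriteStartDocument();
        writer.WriteStartElement("Stock");
        writer.WriteAttributeString("Symbol", "123");
        writer.WriteEndElement();
        writer.WriteEndDocument();
        writer.Flush();

        return target.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteToString());
    }
}
```

The declaration should now say utf-8, but that is only true of the data once the string is actually encoded as UTF-8 bytes, for example when it is written to a file or stream with that encoding. If you need the bytes, the MemoryStream approach is the more direct route.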

                          George_George wrote (#17):

                          Thanks PIEBALDconsult, I have solved this issue by using MemoryStream. :-) regards, George
