System.Xml.XmlDocument

AlgorithmsBorne

XmlDocument xmlDocument = new XmlDocument();
//this throws an XmlException because & and < are not escaped
try
{
xmlDocument.LoadXml("<root>this is unencoded inner text & < > </root>");
}
catch
{
}

//Parses just fine
xmlDocument.LoadXml("<root>this is encoded inner text &amp; &lt; &gt;</root>");

//sweet I don't have to think about encoding.........
//because this will encode for me
xmlDocument.DocumentElement.InnerText = "unencoded & < >";

string encodedOuterXml = xmlDocument.OuterXml;
//encodedOuterXml will contain "unencoded &amp; &lt; &gt;"

//but what if I do think about encoding
xmlDocument.DocumentElement.InnerText = "&amp; &lt; &gt;";

string outerXml = xmlDocument.OuterXml;
//outerXml will contain "&amp;amp; &amp;lt; &amp;gt;" (double encoded)

/*
My point is that IMO inconsistent coding of a class is, in fact, a bug
and that before encoding characters they need to check to see if it is already encoded

it is the same all around the BCL and FCL.

Another example of this is:
*/
uriBuilder.Query += ""; //a UriBuilder; Uri.Query itself is read-only

/*
the more you use += to append to a query, the more ? characters it adds after the path
*/

Enjoi

Steve Hansen

I don't believe the first behavior is a bug. The documentation clearly states that InnerText will be encoded. If you do want to encode yourself then just write to the InnerXml property. :)
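
To make the difference concrete, here is a minimal sketch of the two properties side by side. The class and method names (InnerTextVersusInnerXml, SetAsText, SetAsXml) are made up for illustration; only XmlDocument, InnerText, InnerXml and OuterXml come from the framework.

```csharp
using System;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class InnerTextVersusInnerXml
{
    // Treats value as plain text: markup characters are escaped when serialized.
    public static string SetAsText(string value)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root/>");
        doc.DocumentElement.InnerText = value;
        return doc.OuterXml;
    }

    // Treats value as markup: it is parsed, so it must already be well-formed
    // (and therefore already escaped where needed).
    public static string SetAsXml(string value)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root/>");
        doc.DocumentElement.InnerXml = value;
        return doc.OuterXml;
    }
}
```

SetAsText("a & b") and SetAsXml("a &amp; b") should both give <root>a &amp; b</root>, while SetAsText("a &amp; b") gives <root>a &amp;amp; b</root>, which is the "double encoding" from the first post.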

PIEBALDconsult

Yeah, use the right tool for the right job.

Guffa

I don't understand what it is that you think is inconsistent?

--- It's amazing to see how much work some people will go through just to avoid a little bit of work.

AlgorithmsBorne

The implementation of InnerText without taking InnerXml into account. Basically the idea that you had to assign the InnerText invalid XML to get properly formatted output.

Guffa

SharperAgent wrote:

Basically the idea that you had to assign the InnerText invalid XML to get properly formatted output.

You are not supposed to assign any XML to the InnerText property, you are supposed to assign text to it. That's why it's called InnerText and not InnerXml.


AlgorithmsBorne

Regardless, it is accepting invalid XML as input. InnerText should not parse any of its contents into the tree, whereas InnerXml should. But it all boils down to the fact that it accepts INVALID XML and doesn't validate the encoding if VALID XML is put into it, so it double encodes VALID XML. Enjoi

PIEBALDconsult

Of course it does; XML is text. If you want it handled the same as non-XML text, it must be set with InnerText, the same as other text.

For example, if I have a text messaging program that uses XML something like

<Message Sender="Me" Recipient="You">message text</Message>

then that "message text" is set with the InnerText property. If I send you a well-formed snippet of XML (e.g. <SomeData>Here's the data</SomeData>), I do not want to use InnerXml; I want the XML "text" to be encoded just like any other text.

Correct, using InnerText:

<Message Sender="Me" Recipient="You">&lt;SomeData&gt;Here's the data&lt;/SomeData&gt;</Message>

Incorrect, using InnerXml:

<Message Sender="Me" Recipient="You"><SomeData>Here's the data</SomeData></Message>

Both are well-formed, but the latter is unlikely to be valid. It could be worse if I didn't use attributes (which may be a good reason to use attributes):

<Message>
  <Sender>Me</Sender>
  <Recipient>You</Recipient>
  <Sender>GWB</Sender>
</Message>

Whoops, let's make that:

<Message>
  <Sender>Me</Sender>
  <Recipient>You</Recipient>
  <Text>
    <Sender>GWB</Sender>
  </Text>
</Message>

Hopefully it will still fail validation, because if not, GetElementsByTagName ( "Sender" ) will return both Sender elements.

PIEBALDconsult

If you want to choose InnerText or InnerXml depending on the well-formedness of the value, try something like:

try
{
    t.LoadXml ( args [ 0 ] ) ;
    e.InnerXml = args [ 0 ] ;
}
catch
{
    e.InnerText = args [ 0 ] ;
}
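
A self-contained variation on that idea might look like the sketch below. ContentSetter and SetContent are hypothetical names, and unlike the snippet above it tries InnerXml directly instead of calling LoadXml on a scratch document, on the assumption that fragments without a single root element (such as "a &amp; b") should also count as markup.

```csharp
using System;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class ContentSetter
{
    // Tries to use value as markup; falls back to plain text if it does not parse.
    public static void SetContent(XmlElement element, string value)
    {
        try
        {
            // Parses value as an XML fragment; throws XmlException if malformed.
            element.InnerXml = value;
        }
        catch (XmlException)
        {
            // Not well-formed, so store it as text and let the DOM escape it.
            element.InnerText = value;
        }
    }
}
```

Keep in mind that this guesses the caller's intent: text that merely happens to be well-formed XML ends up in the tree as elements, which is exactly the situation the Message example above warns about.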

Guffa

AlgorithmsBorne wrote:

But it all boils down to the fact that it accepts INVALID XML and doesn't validate the encoding if VALID XML is put into it, so it double encodes VALID XML.

No, it doesn't accept invalid XML. It accepts text, not XML. Whether the text happens to be XML is irrelevant; it should still be treated as text, and it is. If it were treated as XML, the implementation would be inconsistent.
