Why is XML?
-
I retired in 2014 after four decades in software engineering. Though I used XML extensively during my career, I found it more of a nuisance than anything else. XML and JSON merely add layers of software to handle the formats, making them both rather inefficient. And both are text-based. Comma-delimited data is text-based as well, but without all of the extra metadata, and when encrypted it would produce smaller files or data packets for transmission. For most situations, one can use comma-delimited data in the same ways as XML with a little ingenuity.
Steve Naidamast Sr. Software Engineer Black Falcon Software, Inc. blackfalconsoftware@outlook.com
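To put a rough number on the size argument above, here is a minimal sketch that serialises the same records as CSV, JSON and XML and compares the UTF-8 byte counts. The records, field names and element names are made up for illustration, and the ranking for such a tiny flat data set will not necessarily hold for nested or sparse data.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical flat records; every value is kept as text for a fair comparison.
records = [
    {"id": "1", "name": "Widget", "price": "9.99"},
    {"id": "2", "name": "Gadget", "price": "19.50"},
]


def to_csv(rows):
    # One header line, then one line per record.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "name", "price"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows):
    # Compact separators so whitespace does not inflate the result.
    return json.dumps(rows, separators=(",", ":"))


def to_xml(rows):
    # One <item> per record, one child element per field.
    root = ET.Element("items")
    for row in rows:
        item = ET.SubElement(root, "item")
        for key, value in row.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


sizes = {
    label: len(serialise(records).encode("utf-8"))
    for label, serialise in (("csv", to_csv), ("json", to_json), ("xml", to_xml))
}
print(sizes)
```

The CSV output carries the field names once, in the header, while JSON and XML repeat them for every record; that repetition is the "extra metadata" being weighed against self-description.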
Well, the metadata is something I find incredibly useful, and this is also why I tend to prefer XML over JSON as well when I need a robust way to transfer data of more than trivial complexity.
-
Just why. I spent an hour fighting with XmlSerialisers to try and get my object mapped to a schema. Changing names, trying to get attributes set up, dealing with CDATA. I gave up. I got so fed up I simply wrote the XML directly as a raw string. If I could have kicked it I would have kicked it. I totally get the beauty of having data in a class and throwing it at different serialisers and having it Just Work. Switch between XML and JSON and maybe binary and text and build out this whole massive ecosystem that screams "I'm trying to do too much!". But dear lord. It's like root canal surgery. Is anyone actively using XML as a data transport format? I get that we all use it in things like XAML and ASP.NET pages and the whole HTML thing, but as something that is not seen or edited by humans, that needs to be cognizant of bandwidth, is it still being used in that manner or am I just really, really intolerant this morning?
cheers Chris Maunder
I always hated working with XML. JSON is a godsend. I had to serialize from JSON to XML for a file upload; it was mandated that way, not my choice. Anyway, I now have an XML serializer that is stupid simple to use. Keep it simple, keep it moving.
-
XML wasn't originally written for web service data transfer or serialization/deserialization. It was written by the W3C to replace HTML but still be HTML-like. XML is a markup language, hence it has markup. Markup makes it good for readability by humans while also being a standard readable by machines. XML was then hijacked for use by SOAP web services with serialization/deserialization. Then someone realized that JSON was better for serialization/deserialization, especially since readability by both humans and machines wasn't necessary; it only needs to be read by machines. JSON also has room for improvement in verbosity, and as soon as a good replacement exists, people will say the same things: why JSON when new-thing is better?
-
I've had to generate a file from a database, but the built-in methods didn't work for me, so I also just constructed it all by adding to a string.
-
My professor said that XML exists because Microsoft was afraid of being sued because JSON was too much like Java. Kind of like the same reason that C# exists. I think he was being sarcastic but am not totally sure. He did show that the history of the two, and which came first, is debatable. Both have roots that run back a long, long time.
So many years of programming I have forgotten more languages than I know.
Every big company is afraid of being sued, but I doubt that's the reason. Microsoft would more likely choose a competing solution in order to lock out a competitor. The story doesn't seem "right" but who knows. The Microsoft of today is a very, very different company than the Microsoft of 2000. (and it makes them better and worse)
cheers Chris Maunder
-
Chris Maunder wrote:
as something that is not seen or edited by humans
I could've sworn when I first started reading about XML, it was being sold based on the idea that it was trivially easy for people to read and write.
Ah, marketing...
cheers Chris Maunder
-
Hungary's National Tax and Customs Agency requires real-time XML invoice reporting, so we do it. It requires the schema designer to know his/her art, because xsd.exe can choke on
and
type brainless design. Another pain was that while the XML standard is happy with a default namespace, XPath requires a prefix.
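The default-namespace pain can be reproduced in a few lines. The sketch below uses Python's ElementTree rather than .NET, and the namespace URI, element names and the `inv` prefix are invented for the example; the point is only that an unprefixed path step does not match elements that live in a default namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a default namespace, no prefixes on the elements.
DOC = """<Invoice xmlns="urn:example:invoice">
  <Number>INV-001</Number>
</Invoice>"""

root = ET.fromstring(DOC)

# An unprefixed step looks for <Number> in *no* namespace, so nothing matches.
unqualified = root.find("Number")

# Binding a prefix of our own choosing to the same URI makes the lookup work,
# even though the document itself never uses that prefix.
NS = {"inv": "urn:example:invoice"}
qualified = root.find("inv:Number", NS)

print(unqualified)     # None
print(qualified.text)  # INV-001
```

The prefix is purely local to the query: only the URI has to agree with the document.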
-
That brought back XHTML nightmares...
cheers Chris Maunder
-
michaelbarb wrote:
My professor said that XML exists because Microsoft was afraid of being sued because JSON was too much like Java
This doesn't hold up, even if only because it seems backwards. If Wikipedia's accurate, work on XML started in 1996, and [became a "W3C recommendation" in 1998](https://en.wikipedia.org/wiki/XML#History), while JSON only started showing up in the "early 2000s" (granted, with some references to work starting in 1999 - but it was still very early in its design by then). And how is JSON in any way "like Java"? One's a data storage file format. The other's a full-blown programming language.
-
michaelbarb wrote:
My professor said that XML exists because Microsoft was afraid of being sued because JSON was too much like Java.
Then your professor, by "XML exists", presumably meant "XML did not die" rather than "XML was created". XML predates the first JSON RFC by roughly eight years (XML 1.0 became a W3C Recommendation in 1998; RFC 4627 appeared in 2006). And XML was in use for several years before it was formally standardized. I really do not see how Microsoft gets into this. MS certainly defined neither XML nor JSON. I never saw Microsoft as a very active promoter of XML. C# was created by MS. I am not (yet) able to find on the net any documentation of the MS/Sun controversy, but twenty years ago "everybody knew" that C# was a response to Sun not allowing MS to use Java as it wanted. (If my memory is correct, MS wanted to add language features that Sun did not approve of.) So C# is a very different story from XML/JSON. XML syntax borrows a lot from far older formats: typesetting systems of the late 70s (maybe even older) used the same style of bracketed keywords, e.g. to delimit paragraphs and specify paragraph formatting. You can see a selection of such tags e.g. in the 1982 Historical Manuals: Guide to Typesetting at the UKCC, at pages 14-15. In the typesetting systems I have touched, the brackets were displayed as common brackets, but had a different internal representation, and distinct keys on the dedicated terminals. So there was no need for escaping or other special handling of the common brackets (or the math less-than/greater-than signs).
-
Do you remember what we had before XML? Talk about misery! When I joined here, I was working at an Ace Hardware, and trying to get our in-house system to integrate with the Ace Corporate online ordering system (no Internet then, direct dial-up connection) required the patience of Job, along with a love of self-abuse. I'm still grateful for XML!
Will Rogers never met me.
-
IDKW people are bad-mouthing XML OR barking that JSON is the magic carpet of serialization. XML was an attempt by some fairly smart people (smarter than me) to make a standardized, text-based, human-readable serialization standard. Is it 'easily' human-readable? Most but not all of the time; still useful. Can you just open it in a text editor and read it? Yes. I didn't say you'd enjoy the experience, but you could, can, and do. Is it a good standard? Well, people are still using it today, it works, and 'works' gives programmers and interested parties (those with the money) this warm fuzzy feeling inside. And consider this: you can make some REALLY simple XML, read, serialize, and transmit it, and you can make some seriously complex XML, read, serialize, and transmit that. Things you couldn't imagine serializing in JSON. That's what the X stands for: Extensible. Nevertheless, just like someone else said on this post, EVERYONE IS USING XML EVERY DAY. It's called the internet, it's called HTML, which, really deep down, is XML that transmits some of the greatest amounts of information around the world billions and billions of times over, and it just works. Not perfectly, but it does.
-
Yes, the discussion certainly drifted a little from "I am losing my mind trying to coerce a class to serialise to XML using the .NET serialisers", and the dumb hacks you do when you just need to get something done and don't actually need the magic the classes provide. XML, to me, followed the classic arc of new-tech-to-solve-well-defined-problem, into the you-can-use-it-everywhere! right into we're-using-it-everywhere!-Even-my-cat-uses-it and then into the gutter of why-on-earth-is-it-being-used-here. I see the same thing with AI to be honest. Amazing idea, finally hit its stride, and now you can't swing said cat without hitting half a dozen products that use AI for no reason other than to have "uses AI" in their marketing (or they really mean they use a Bayesian model or even just basic statistical analysis).
cheers Chris Maunder
-
chrisseanhayes wrote:
Is it 'easily' human-readable? Most but not all the time, still useful.
Almost twenty years ago, when XML was super-hyped, I was involved in digital library projects. Everyone was praising XML as The Savior, the greatest thing since sliced bread. I went to a Digital Libraries conference: of the first eleven papers presented, ten were making a big issue of XML adoption being crucial to their project's success... I got XML up to here. Much of that was because of the extreme hype, the total lack of any critical evaluation of its suitability, and the stereotypical praise of such "qualities" as "human readable". You don't even need a schema - the tags are self-documenting! They are? I went to one of the Sami-speaking guys to get a list of Sami language terms for chapter, section, paragraph, table of contents etc., as well as some Sami text, and composed a sample Sami XML document. This I frequently used to illustrate the "readability" of XML. (Note that the Sami culture in Norway is quite strong, and it is certainly to be expected that a digital library receives XML documents according to a schema specifying Sami tags. Or, if your library handles XML documents of Asian origin, don't be surprised if tags contain, say, Chinese or Thai characters.) A second example I used involved a 'p' tag. What can we expect it to identify? A paragraph? A part number? A person reference? I could show actual examples of all three interpretations, but made up other possible uses: a point, a position, a product name, a page number ... When you see a 'p' tag, you immediately understand that it has something to do with something relating to the letter p, most likely some concept that starts with 'p' ... in some language. It doesn't have to be English. In an international world, you cannot take for granted that the schema designer prioritizes readability for native English speakers over readability for the native speakers of the language of the document.
This project I was on was focused on long term document archival: As far as possible, a faithful reproduction of the original should be possible fifty years from now, a hundred years, or more. So, from different document format specifications, I collected no less than fourteen different parameters affecting the formatting of a paragraph. Some day many years from now, you are to interpret a document with a lot of 'avsnitt' tags. After some searching, you realize that 'avsnitt' is Norwegian for 'paragra
-
As one of our exercises working in Java, we were to take a JSON file and convert it to a routine that could be compiled into the program. It was to contain data that the program loaded. As I remember from the early 20's, it was quite easy.
So many years of programming I have forgotten more languages than I know.
-
When I say 'human-readable' I mean "it's not a binary file that only a proprietary algorithm can decipher; you can output the text to your output device of choice, actually read the information therein, and with some cognitive overhead understand what is being stored/transmitted". What I don't mean is that it will be like reading Edgar Allan Poe or your favorite Robert Ludlum novel. I mean a human can get in there, do cursory searches and nail down information or even a bug. "Oh, look, John sent a thingabob instead of a hoopadadoop, now I know why the serializer crashed." And again, XML isn't a fix-all magic drug that can solve all serialization problems, nor is JSON. Engineering is a set of tradeoffs. This is better than that at this moment in time for this problem. I'm glad I knew about XML at that particular time for that particular need. The same goes for any other serialization. I'm glad I knew about binary serializers, which got the info across fast in a situation where humans were really never going to need to read the intermediate data anyway.
-
I like your reply. If there was ever a red flag about some technology, to use or not use or abuse, it's those knee-jerk reactions where everyone floods to one side of an argument like a holy war. Engineering is really just a bunch of tradeoffs in favor of the most 'optimum' solution given the time and available smarts. Just look at what they did with a simple adjective like 'agile', and at all those tussles over whether to normalize data or not. Thanks for this lively discussion.
-
Use the DOM API to build your object or for parsing. It is verbose, but simple. String building will always burn you later. We sometimes utilize a cousin of XML called BobML. Differences from XML: no attributes, and no escaping of the 5 special chars. Verbose but simple to generate and parse via string functions into values and lists. Developed by Bob of course!
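As an illustration of how string building burns you, here is a small sketch comparing naive concatenation with a tree API. It uses Python's ElementTree as a stand-in for a DOM-style API (it is not the W3C DOM), and the element names and sample value are invented for the example.

```python
import xml.etree.ElementTree as ET


def build_by_string(name: str) -> str:
    # Naive concatenation: special characters are copied through unescaped.
    return "<customer><name>" + name + "</name></customer>"


def build_by_api(name: str) -> str:
    # The tree API escapes text content when it serialises the element.
    customer = ET.Element("customer")
    ET.SubElement(customer, "name").text = name
    return ET.tostring(customer, encoding="unicode")


tricky = "Smith & Sons <Ltd>"

# The API-built document round-trips the original value.
api_xml = build_by_api(tricky)
round_tripped = ET.fromstring(api_xml).find("name").text

# The hand-built string is not well-formed XML for this input.
try:
    ET.fromstring(build_by_string(tricky))
    string_ok = True
except ET.ParseError:
    string_ok = False

print(api_xml)
print(round_tripped == tricky, string_ok)
```

The hand-built version works fine until the first value containing `&` or `<` arrives, which is usually in production.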
-
Trust me, it can get worse. I write plugins for an application that stores objects in XML files, insisting on a BOM and failing to correctly translate different string types from one format to another* when you try to use the methods intended for exactly that purpose. I thought using a library like Xerces-C would save me from dealing with the finer details, but in the end I needed to stream the XML file I had already written into another one and insert the BOM manually - because for some reason Xerces-C ignored my telling it to do that for me... Of course, all that requires first finding out that there is such a thing as a BOM, which isn't immediately obvious when you look at your created XML file with some standard editor. Then you need an editor that actually knows the difference. Then you need to know where to look to see whether you actually have a BOM (or not), then you need to duckduckgo (or google if you prefer) for whatever that flag is and how it's coded, and then you need to find out how in the world you get it written! *choose any two from: utf8, utf16, wchar_t*, CString (MFC), XMLCh* (which is simply an unsigned short "string")
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
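For anyone hunting the same BOM problem, here is a minimal sketch of the "write it yourself, then check the first bytes" workaround described above. It is Python rather than Xerces-C, it assumes UTF-8 output, and the element names, file name and helper names are made up for the example.

```python
import codecs
import os
import tempfile
import xml.etree.ElementTree as ET


def write_with_bom(root: ET.Element, path: str) -> None:
    # Serialise to UTF-8 bytes, then prepend the three BOM bytes ourselves.
    body = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    with open(path, "wb") as fh:
        fh.write(codecs.BOM_UTF8 + body)


def has_utf8_bom(path: str) -> bool:
    # The UTF-8 BOM is the byte sequence EF BB BF at the very start of the file.
    with open(path, "rb") as fh:
        return fh.read(3) == codecs.BOM_UTF8


root = ET.Element("object")
ET.SubElement(root, "name").text = "demo"

path = os.path.join(tempfile.mkdtemp(), "with_bom.xml")
write_with_bom(root, path)

print(has_utf8_bom(path))
```

Reading the first three bytes in binary mode sidesteps the problem of editors that silently hide the BOM.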