Why is XML?
-
Chris Maunder wrote:
as something that is not seen or edited by humans
I could've sworn when I first started reading about XML, it was being sold based on the idea that it was trivially easy for people to read and write.
Ah, marketing...
cheers Chris Maunder
-
Just why. I spent an hour fighting with XmlSerialisers to try and get my object mapped to a schema. Changing names, trying to get attributes set up, dealing with CDATA. I gave up. I got so fed up I simply wrote the XML directly as a raw string. If I could have kicked it I would have kicked it.

I totally get the beauty of having data in a class, throwing it at different serialisers and having it Just Work. Switch between XML and JSON and maybe binary and text, and build out this whole massive ecosystem that screams "I'm trying to do too much!". But dear lord. It's like root canal surgery.

Is anyone actively using XML as a data transport format? I get that we all use it in things like XAML and ASP.NET pages and the whole HTML thing, but as something that is not seen or edited by humans, that needs to be cognizant of bandwidth, is it still being used in that manner or am I just really, really intolerant this morning?
cheers Chris Maunder
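The "throw it at different serialisers" ideal in the post above can be sketched in a few lines. This is only an illustration (the Invoice class and its fields are invented for the example), showing why JSON tends to feel like one call while even a trivial XML mapping needs hand-built elements:

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass

@dataclass
class Invoice:            # hypothetical payload, just for the comparison
    number: str
    total: float

inv = Invoice(number="INV-42", total=99.5)

# JSON: one call and it Just Works.
as_json = json.dumps(asdict(inv))   # {"number": "INV-42", "total": 99.5}

# XML: even this trivial mapping needs hand-built elements, and schema
# details (attributes vs. elements, CDATA, namespaces) only add to it.
root = ET.Element("Invoice")
for key, value in asdict(inv).items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")
# <Invoice><number>INV-42</number><total>99.5</total></Invoice>
```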
Hungary's National Tax and Customs Agency requires real-time XML invoice reporting[^], so we do it. It requires the schema designer to know his/her art, because xsd.exe[^] can choke on brainless schema designs. Another pain was that while the XML standard is happy with a default namespace, XPath requires a prefix[^].
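That XPath/namespace mismatch is easy to reproduce; here is a minimal Python sketch (the namespace URI and element names are invented for illustration). The document declares a default namespace with no prefix anywhere, yet the query only works once you assign it one:

```python
import xml.etree.ElementTree as ET

# A document using a *default* namespace: no prefix in sight.
doc = """<Invoice xmlns="http://example.com/invoice">
  <total>100</total>
</Invoice>"""
root = ET.fromstring(doc)

# The obvious prefix-free query finds nothing, because every element
# actually lives inside the namespace.
assert root.find("total") is None

# XPath-style lookup only works after inventing a prefix for the URI.
ns = {"inv": "http://example.com/invoice"}
assert root.find("inv:total", ns).text == "100"
```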
-
XML wasn't originally written for web service data transfer or serialization/deserialization. It was written by the W3C to replace HTML while still being HTML-like. XML is a markup language, hence it has markup. Markup makes it good for readability by humans but also a standard readable by machines. XML was then hijacked to be used by SOAP web services for serialization/deserialization. Then someone realized that JSON was better for serialization/deserialization, especially since readability by both humans and machines wasn't necessary; it only needs to be read by machines. JSON also has room for improvement in verbosity, and as soon as a good replacement exists, people will say the same things: why JSON when new-thing is better?
That brought back XHTML nightmares...
cheers Chris Maunder
-
My professor said that XML exists because Microsoft was afraid of being sued because JSON was too much like Java. Kind of like the same reason that C# exists. I think he was being sarcastic but am not totally sure. He did show the history of the two, and which came first is debatable. Both have roots that run back a long, long time ago.
So many years of programming I have forgotten more languages than I know.
michaelbarb wrote:
My professor said that XML exists because Microsoft was afraid of being sued because JSON was too much like Java
This doesn't hold up, even if only because it seems backwards. If Wikipedia's accurate, work on XML started in 1996, and [became a "W3C recommendation" in 1998](https://en.wikipedia.org/wiki/XML#History), while JSON only started showing up in the "early 2000s" (granted, with some references to work starting in 1999 - but it was still very early in its design by then). And how is JSON in any way "like Java"? One's a data storage file format. The other's a full-blown programming language.
-
michaelbarb wrote:
My professor said that XML exists because Microsoft was afraid of being sued because JSON was too much like Java.
Then your professor, by "XML exists", presumably meant "XML did not die" rather than "XML was created". XML predates the first JSON RFC by ten years. And XML was in use for several years before it was formally standardized. I really do not see how Microsoft gets into this. MS certainly defined neither XML nor JSON. I never saw Microsoft as a very active promoter of XML.

C# was created by MS. I am not (yet) able to find on the net any documentation of the MS/Sun controversy, but twenty years ago "everybody knew" that C# was a response to Sun not allowing MS to use Java as it wanted. (If my memory is correct, MS wanted to add language features that Sun did not approve of.) So C# is a very different story from XML/JSON.

XML syntax borrows a lot from far older formats: typesetting systems of the late 70s (maybe even older) used the same style of bracketed keywords, e.g. to delimit paragraphs and specify paragraph formatting. You can see a selection of such tags e.g. in the 1982 Historical Manuals: Guide to Typesetting at the UKCC[^], at pages 14-15. In the typesetting systems I touched, the brackets were displayed as common brackets, but had a different internal representation, and distinct keys on the dedicated terminals. So there was no need for escaping or other special handling of the common brackets (or the math less-than/greater-than signs).
-
Do you remember what we had before XML? Talk about misery! When I joined here, I was working at an Ace Hardware, and trying to get our in-house system to integrate with the Ace Corporate online ordering system (no Internet then, direct dialup connection) required the patience of Job, along with a love of self-abuse. I'm still grateful for XML!
Will Rogers never met me.
-
IDKW people are bad-mouthing XML OR barking that JSON is the magic carpet of serialization. XML was an attempt by some fairly smart people (smarter than me) to make a standardized, text-based, human-readable serialization standard.

Is it 'easily' human-readable? Most but not all the time, still useful. Can you just open it in a text editor and read it? Yes; I didn't say you'd enjoy the experience, but you could, can, and do. Is it a good standard? Well, people are still using it today, it works, and 'works' gives programmers and interested parties (those with the money) a warm fuzzy feeling inside.

And consider this: you can make some REALLY simple XML, read, serialize, and transmit it, and you can make some seriously complex XML, read, serialize, and transmit that. Things you couldn't imagine serializing in JSON. That's what the X stands for: eXtensible. Nevertheless, just like someone else said on this post, EVERYONE IS USING XML EVERY DAY. It's called the internet; it's called HTML, which, really deep down, is a close sibling of XML (both descend from SGML) and transmits some of the greatest amounts of information around the world billions and billions of times over. And it just works. Not perfect, but it does.
-
Yes, the discussion certainly drifted a little from "I am losing my mind trying to coerce a class to serialise to XML using the .NET serialisers", and the dumb hacks you do when you just need to get something done and don't actually need the magic the classes provide. XML, to me, followed the classic arc of new-tech-to-solve-well-defined-problem, into the you-can-use-it-everywhere! right into we're-using-it-everywhere!-Even-my-cat-uses-it and then into the gutter of why-on-earth-is-it-being-used-here. I see the same thing with AI to be honest. Amazing idea, finally hit its stride, and now you can't swing said cat without hitting half a dozen products that use AI for no reason other than to have "uses AI" in their marketing (or they really mean they use a Bayesian model or even just basic statistical analysis).
cheers Chris Maunder
-
chrisseanhayes wrote:
Is it 'easily' human-readable? Most but not all the time, still useful.
Almost twenty years ago, when XML was super-hype, I was involved in digital library projects. Everyone was praising XML as The Savior, the greatest thing since sliced bread. I went to a Digital Libraries conference: of the first eleven papers presented, ten were making a big issue of XML adoption being crucial to their project's success... I got XML up to here. Much because of the extreme hype, the total lack of any critical evaluation of its suitability, and stereotypical praise of such "qualities" as "human readable".

You don't even need a scheme - the tags are self-documenting! They are? I asked one of the Sami-speaking guys to give me a list of Sami language terms for chapter, section, paragraph, table of contents etc., as well as some Sami text, and composed a sample Sami XML document. This I frequently used to illustrate the "readability" of XML. (Note that the Sami culture in Norway is quite strong, and it is certainly to be expected that a digital library receives XML documents according to a scheme specifying Sami tags. Or, if your library handles XML documents of Asian origin, don't be surprised if tags contain, say, Chinese or Thai characters.)

A second example I used involved a 'p' tag. What can we expect it to identify? A paragraph? A part number? A person reference? I could show actual examples of all three interpretations, but made up other possible uses: a point, a position, a product name, a page number ... When you see a 'p' tag, you immediately understand that it has something to do with the letter p, most likely some concept that starts with 'p' ... in some language. It doesn't have to be English. In an international world, you cannot take for granted that the scheme designer prioritizes readability for native English speakers over readability for the native speakers of the language of the document.

This project I was on was focused on long-term document archival: as far as possible, a faithful reproduction of the original should be possible fifty years from now, a hundred years, or more. So, from different document format specifications, I collected no less than fourteen different parameters affecting the formatting of a paragraph. Some day many years from now, you are to interpret a document with a lot of 'avsnitt' tags. After some searching, you realize that 'avsnitt' is Norwegian for 'paragraph'...
-
As one of our exercises working in Java, we were to take a JSON file and convert it to a routine that could be compiled into the program. It was to contain data that the program loaded. As I remember from the early '20s, it was quite easy.
So many years of programming I have forgotten more languages than I know.
-
When I say 'human-readable' I mean "it's not a binary file that only a proprietary algorithm can decipher; you can output the text to your output device of choice and actually read the information therein, and with some cognitive overhead understand what is being stored/transmitted". What I don't mean is that it will be like reading Edgar Allan Poe or your favorite Robert Ludlum novel. I mean a human can get in there, do cursory searches, and nail down information or even a bug. "Oh, look, John sent a thingabob instead of a hoopadadoop, now I know why the serializer crashed."

And again, XML isn't a fix-all magic drug that can solve all serialization problems, nor is JSON. Engineering is a set of tradeoffs. This is better than that, at this moment in time, for this problem. I'm glad I knew about XML at that particular time for that particular need. The same goes for any other serialization. I'm glad I knew about binary serializers: got the info across fast, in a situation where humans were really never going to need to read the intermediate data anyway.
-
I like your reply. If there was ever a red flag about some technology (to use, not use, or abuse), it's those knee-jerk reactions where everyone floods one side of an argument like a holy war. Engineering is really just a bunch of tradeoffs in favor of the most 'optimum' solution given the time and available smarts. Just look what they did with a simple adjective like 'agile', and all those tussles over whether or not to normalize data. Thanks for this lively discussion.
-
Use the DOM API to build your object or for parsing. It is verbose, but simple. String building will always burn you later. We sometimes utilize a cousin of XML called BobML. Differences from XML: no attributes, and no escaping of the 5 special chars. Verbose, but simple to generate and parse via string functions into values and lists. Developed by Bob, of course!
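BobML itself is in-house and unpublished, so the following Python sketch is only a guess at its shape from the description above: plain tag/value pairs, no attributes, no escaping, parsed with ordinary string functions. All names here are hypothetical:

```python
# A guess at the shape of a BobML-style format: plain <tag>value</tag>
# pairs, no attributes, no escaping, parsed with ordinary string functions.
def bob_emit(name, value):
    return f"<{name}>{value}</{name}>"

def bob_extract(text, name):
    """Return every value wrapped in <name>...</name>, in order."""
    open_tag, close_tag = f"<{name}>", f"</{name}>"
    values, pos = [], 0
    while True:
        start = text.find(open_tag, pos)
        if start == -1:
            return values
        start += len(open_tag)
        end = text.find(close_tag, start)
        values.append(text[start:end])
        pos = end + len(close_tag)

doc = bob_emit("name", "Bob") + bob_emit("name", "Alice")
# bob_extract(doc, "name") gives the list ['Bob', 'Alice']
```

The "no escaping" rule is what keeps both sides trivial: generation is string concatenation, and parsing never has to decode entities.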
-
Trust me, it can get worse. I write plugins for an application that stores objects into XML files, insisting on using the BOM flag and failing to correctly translate different string types from one format to another* when you try to use the methods intended for exactly that purpose. I thought using a library like Xerces-C would save me from dealing with the finer details, but in the end I needed to stream the already-written XML file I created into another one and insert that BOM flag manually - because for some reason Xerces-C ignored my telling it to do that for me...

Of course, all that first requires finding out there is such a thing as a BOM flag, which isn't immediately obvious when you look at your created XML file with some standard editor. Then you need an editor that actually knows the difference. Then you need to know where to look to see whether you actually have a BOM flag (or not), then you need to duckduckgo (or google if you prefer) whatever that flag is and how it's coded, and then you need to find out how in the world you get it written!

*choose any two from: utf8, utf16, wchar_t*, CString (MFC), XMLCh* (which is simply an unsigned short "string")
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
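The BOM pain described above is easy to reproduce in any language; here is a small Python sketch showing how a UTF-8 BOM gets written, how to detect it, and why it hides from a casual look in an editor:

```python
import codecs

xml_text = '<?xml version="1.0" encoding="UTF-8"?><root/>'

# The 'utf-8-sig' codec prepends the three BOM bytes EF BB BF on encode...
data = xml_text.encode("utf-8-sig")
assert data[:3] == codecs.BOM_UTF8

# ...and strips them again on decode, so the round trip is lossless.
assert data.decode("utf-8-sig") == xml_text

# A plain utf-8 decode keeps the BOM as an invisible U+FEFF character,
# which is exactly why it is so easy to miss in a standard editor.
assert data.decode("utf-8")[0] == "\ufeff"
```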
-
Roger Wright wrote:
Do you remember what we had before XML?
Well ... What if I do? I was studying ASN.1 in the very early 1980s.

The scheme is mandatory; the legal constructions are always specified. Great! It is abstract: ASN.1 specifies the logical structure of files/documents, with no concern for a specific representation or format. Great! An ASN.1 document/file may be represented in a handful of well-specified, clearly identified concrete encodings - functionally 100% identical; you may read in one encoding and write back in another, with no loss of information. Great! You must have access to the scheme, which ensures a proper interpretation with no guesswork. You know what you get. Great! A data stream not adhering to the scheme is like a transmission ruined by noise: it is worth nothing. Any non-ruined document honors the ASN.1 scheme. Unconditionally. Great!

The 'Tag' part is binary. You display it to the user e.g. by mapping it to local-language terms (since you must have access to the scheme, you have an opportunity to set up a meaningful mapping) - I worked with a handful of applications providing mappings of tags to several different languages. Great! The representation essentially being binary required the use of an ASN.1 data editor - vi wouldn't suffice. As a result, you never forgot to add the closing tag (there was no closing tag). Great! You never misspelled a tag name, but selected from those allowed by the scheme. Great! You never got the length wrong - that was handled by the ASN.1 editor and concrete coder. Great! The format was space efficient (although dependent on the encoding), BER excessively so, according to some critics. (Some other concrete encodings, such as the XML encoding, could be quite wasteful, though.) Great! The 'Value' part was completely unrestricted, a binary blob, with no need for escapes, quoting or anything resembling 'character entities'. (Or Base64, QP, AtoB/BtoA, BinHex, UUencode, or whathaveyou.) Great!

If you wanted to edit an ASN.1 document, you had to use an ASN.1 editor that made sure the scheme was honored; the document couldn't be arbitrarily tampered with outside scheme control. Great! I sure miss both the abstract ASN.1 side and the BER encoding.
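As a rough illustration of the BER idea praised above (binary tag, explicit length, raw value, no closing tags, no escaping), here is a toy tag-length-value codec in Python. It is a sketch of the principle, not conformant BER (real BER has multi-byte tags, long-form lengths, constructed types, etc.):

```python
# An illustrative tag-length-value (TLV) codec in the spirit of BER.
def tlv_encode(tag: int, value: bytes) -> bytes:
    assert len(value) < 128          # short-form length only, for brevity
    return bytes([tag, len(value)]) + value

def tlv_decode(data: bytes):
    """Yield (tag, value) pairs from a flat TLV stream."""
    pos = 0
    while pos < len(data):
        tag, length = data[pos], data[pos + 1]
        yield tag, data[pos + 2:pos + 2 + length]
        pos += 2 + length

# 0x0C and 0x02 happen to be the real BER tags for UTF8String and INTEGER.
stream = tlv_encode(0x0C, b"hello") + tlv_encode(0x02, b"\x2a")
assert list(tlv_decode(stream)) == [(0x0C, b"hello"), (0x02, b"\x2a")]
```

Note how the length prefix makes closing tags unnecessary, and how the value bytes pass through untouched: no escapes, no entities, no Base64.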
-
That's what killed me: I have a working implementation in Json but needed (evidently) to have it work in XML too. Json: it just worked. Next? XML: my life is a miserable series of pointless failures
cheers Chris Maunder
-
To your question "is anyone actively using XML as a data transport format?" I hope the answer is negative, because JSON is easier for transporting data. ;)

Another thing is to continue using XML vocabularies, e.g. UBL from oasis-open.org, to generate electronic documents, which have dematerialized commercial documents, especially for electronic businesses. The evidence of the commercial transaction is the XML object, and this evidence is regulated and recognized by the authorities of the countries, who give legal effect to the artifacts; the market trusts their use when they carry digital signatures that are also legally recognized.

Colombia uses ubl-Invoice and other documents, digitally signed, as securities that circulate in the market for the sale of "discounted invoices", a type of factoring. Authenticity, integrity and non-repudiation are protected by the legislation, and availability is protected by a State entity through public storage services and a time-stamp approach. In the future these dematerialized documents will be registered and made available in decentralized blockchain services, and rules will be designed in smart contracts based on conditions written in a formal XML document, stored in the syntax and semantics of the chosen vocabulary. Thanks to techniques such as XAdES-EPES from etsi.org, the Merkle trees of the blockchain will be reinforced with the cryptographic summaries (SHA-2, SHA-3, ...) used for the digital signature, and the PKI + PKC of the blockchain nodes can be reinforced by means of digital certificates from CAs recognized in well-defined legal jurisdictions.

Colombia, other countries of the American continent, and the European Union process many millions of XML documents daily, with these vocabularies or with their own XSD schemas.

Best regards,
-
If you start with an XML or XSD file, there is a command-line tool to generate code: xsd.exe. https://docs.microsoft.com/en-us/dotnet/standard/serialization/xml-schema-def-tool-gen I have used this many times with little hassle.
Alan
-
Some guy somewhere had a bad dream and woke up with an idea: now, how can I make something totally confusing and complicated, which computers can read effortlessly but humans will find totally incomprehensible? He came up with XML and ticked all the necessary boxes/requirements perfectly. Personally I am not a fan of JavaScript, but boy, did they get that JSON stuff right. Whatever programming language you care to use, the JSON data exchange is dead easy to follow and debug. Leave the hard interpreting stuff to computers, not humans. For God's sake: that is what we designed them for!!!