Making a data notation by extending another language
-
JSON and XML and YAML and ... Isn't the whole bunch of them wheel reinventions? When everybody else is creating new wheels that are better suited for the purpose than all the old ones, why shouldn't I do the same? :-)
Now I have personally come to one conclusion, in particular from many years of exposure to XML: Data description languages are for computers, not for humans. This is the kind of stuff that you, a human, do not handle better than a computer does. You make typos, you do not structure it according to the rules; in brief: you mess it up. So keep humans out of it! The best way of doing that is to make it unreadable. Binary.
I know that is a highly Politically Incorrect statement; yet I think that what humans should not mess up should not be made available for messing up, especially not with as simple a tool as a plain text editor. You can also mess up by using a binary generator (or editor), but that takes a lot more deliberate action. The mess comes from "You asked for it, you got it", not from "Ooooops!"
So when I need to store data for my own applications (and there are no requirements for sharing the data files with other applications), I do it as binary files. Always in a Tag-Length-Value format, evading all sorts of escape mechanisms. No need to search for the end of the field. Arbitrary binary data. Space allocation for the value can be made before it is actually read. Parsing the file is extremely fast. The space overhead is quite moderate.
Details of how you do the TLV format may vary slightly. E.g. in some applications, there will never be more than a couple hundred distinct tags, so the tag is stored in 15 bits; the "sign bit" is a flag indicating that the Value is in fact a sequence of TLV values. If values are small, the length is 16 bits, too. If there is any risk at all of overflow, I use the BER style of variable-length handling: the length of the enclosing TLV is 0, each member carries its own length, and the member sequence is terminated by an all-zero TLV. (Then you cannot preallocate space for the entire structure without reading it, but usually a composite value won't be stored as a single unit anyway.)
Just as all class definitions have a ToString, they have a ToTLV. And a FromTLV. The "Schema" is represented by these ToTLV functions. If any other application needs data in other formats, adding ToXML, ToJSON, ToYAML, ... alongside ToString and ToTLV is straightforward. But for the application's private file, the binary ToTLV is used.
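For readers who want something concrete, here is a minimal Python sketch of the fixed-length variant described above: a 16-bit tag field whose top bit marks a composite value, followed by a 16-bit length. The big-endian byte order and the names `encode` and `decode` are assumptions made for illustration, not details from the post, and the zero-length/terminator variant is not covered.

```python
import struct

COMPOSITE = 0x8000              # "sign bit" of the tag field: value is a sequence of TLVs
HEADER = struct.Struct(">HH")   # 16-bit tag field + 16-bit length (byte order assumed)

def encode(tag, value):
    """Encode one item; value is bytes (leaf) or a list of (tag, value) pairs."""
    if not 0 <= tag < COMPOSITE:
        raise ValueError("tag must fit in 15 bits")
    if isinstance(value, (bytes, bytearray)):
        field, payload = tag, bytes(value)
    else:
        field = tag | COMPOSITE
        payload = b"".join(encode(t, v) for t, v in value)
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a 16-bit length")
    return HEADER.pack(field, len(payload)) + payload

def decode(buf):
    """Decode a concatenation of TLV items into a list of (tag, value) pairs."""
    items, pos = [], 0
    while pos < len(buf):
        field, length = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        payload = buf[pos:pos + length]   # length is known up front: no escapes, no scanning
        if len(payload) != length:
            raise ValueError("truncated value")
        pos += length
        if field & COMPOSITE:
            items.append((field & 0x7FFF, decode(payload)))
        else:
            items.append((field, payload))
    return items

# A record (tag 1) holding a name (tag 2) and a 32-bit integer (tag 3).
record = encode(1, [(2, b"Ada"), (3, struct.pack(">i", 42))])
```

Because every value is preceded by its length, the reader can allocate the buffer before reading the value and can skip unknown tags without parsing them.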
While the binary format you describe is interesting, it's not what I asked about. Nevertheless, I'll try creating one in the future.
-
XML is very verbose and JSON doesn't have extendable types.
XML existed before JSON. And data interchange formats benefit from being verbose. Due to readability; it's not a binary format. Come to the point please.
Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^] "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
How does "XML existed before JSON" relate to either "XML is very verbose" or "JSON doesn't have extendable types"? In which ways do "data interchange formats benefit from being verbose"? Most users today do not read the raw data interchange format directly, as-is; they process it with software that e.g. highlights labels, closing tags etc., and allows collapsing of substructures. When you pass it through software anyway, what impact on readability does the format of the input to this display processor have? With semantically identical information, but binary coded, as input to the display processor, why would the readability be better with a character encoding of the information than with a binary encoding?
-
XML and JSON are not good enough, so I won't use them.
-
Semantical bullshit, aka wordsmithing. I've been on that train before. You're trying to act as if binary is the solution to formats; it's not. Anything, text or date, is stored as bits, and is thus in binary. ASCII is a representation of that, UTF is a better form of ASCII. Dates are stored as floats. I don't care what university. You can either learn or be ridiculed. And damn right I will, at every opportunity. And yes, being "kind" :suss:
-
:D They might not be efficient to you; but lots of us use them, both, where appropriate. Try to explain why XML isn't good enough, and how many floppy discs you're limited to that you need that optimization. Do elaborate, please.
-
You have several times in this thread more or less insisted on relating to (7-bit) ASCII and floppy disks. No one else here cares about either of those. If they are your frame of reference, then refer your experience to them. I don't care to. And I don't think the effort to explain why not would be justified. I am not (and I guess a few others agree) demanding that you critically assess your choice of data formats and other solutions. You may go on as you please, with the formats that please you, with or without any critical evaluation. You are welcome.
-
If you really want me to explain to you the difference between storing an integer, say, as a 32-bit binary number vs. storing it as a series of digit characters, because "ASCII is bits, hence digital", then I give up. Sorry.
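For anyone following along, the difference being referred to can be shown in a few lines of Python. This is only an illustration with an arbitrarily chosen value; the big-endian byte order is an assumption.

```python
import struct

value = 1234567890

as_binary = struct.pack(">i", value)    # 32-bit two's complement: always 4 bytes
as_text = str(value).encode("ascii")    # one byte per decimal digit: 10 bytes here

# The binary form is read back with a fixed-size unpack;
# the text form has to be parsed digit by digit.
from_binary = struct.unpack(">i", as_binary)[0]
from_text = int(as_text)

print(len(as_binary), len(as_text))     # 4 10
```

Both forms are "bits" on disk, but the binary one has a fixed size and needs no delimiter, while the text one varies in length with the magnitude of the number.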
Member 7989122 wrote:
If you really want me to explain to you the difference between storing an integer, say, as a 32 bit binary number vs. storing it as a series of digit characters
I didn't say that, and I'm not going to explain either. I've no need to, nor any desire.
Member 7989122 wrote:
then I give up. Sorry.
:) Good timing. And please do.
-
Not with or without critical evaluation, but an education. One expects that a developer knows the different text-formats (and encodings, which is the same to you), data-formats, and date-formats. One who mixes those in a semantical bullshit argument gets called out. So damn right I will. Either play your cards or fold.
-
I don't mean that I won't use XML/JSON. I think they are not good enough, so I still want to create my own data notation. I'm just saying that this is off topic (I have used Stack Exchange sites before) and I don't want to discuss it any further (as it doesn't add anything to my original question).
-
What does that have to do with anything? I merely pointed out that there are two existing, well tried and widely supported systems for data interchange. You can use them or not as you choose.
Well, pointing to XML/JSON was off topic as well.
-
I want to create a data notation (used the way JSON is used).
1) Is it good enough to use a language's built-in features (types, notation etc.), extend them (e.g. with additional types), and output some JSON?
2) Or should I build it from scratch, parse all the built-in features myself, add my additions, and then output JSON?
With 1) I don't have to implement the core things. If there are fixes, then that's good. If there are changes that I don't like, I can deal with them case by case... I guess. However, I'm tied to a programming language, so users have to use that programming language (instead of a library).
With 2) I have to build everything, but I'm not tied to one particular language. Maybe I can mix them ( 1) for the host language, 2) for other languages).
What are your thoughts on this topic?
PS. I was thinking about using Red ( red-lang.org/ ). It's in alpha but I don't think it will change a lot.
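To make option 1) a bit more tangible, here is a rough sketch in Python rather than Red: the host language supplies the built-in types and notation, one extra type is added on top, and the result is written out as JSON. The `Span` type, the `$type` marker key, and both helper functions are hypothetical names invented for this illustration.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    """Hypothetical extension type: an inclusive integer range such as 2-3."""
    lo: int
    hi: int

def to_json_value(obj):
    """Called by json.dumps for objects it cannot serialize on its own."""
    if isinstance(obj, Span):
        return {"$type": "span", "lo": obj.lo, "hi": obj.hi}
    raise TypeError(f"cannot serialize {type(obj).__name__}")

def from_json_value(d):
    """Called by json.loads for every decoded JSON object."""
    if d.get("$type") == "span":
        return Span(d["lo"], d["hi"])
    return d

# Built-in types (dict, str, int) come for free; only Span needs extra code.
data = {"name": "example", "pages": Span(2, 3)}
text = json.dumps(data, default=to_json_value)
restored = json.loads(text, object_hook=from_json_value)
```

The trade-off is the one described above: little code to write, but the notation is whatever the host language accepts, so authors of data files must write in that language.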
Are you looking for something like protobuf?
Protocol Buffers | Google Developers[^]:
Protocol buffers are a language-neutral, platform-neutral extensible mechanism for serializing structured data.
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
nedzadarek wrote:
I want to create data notation (like JSON is used).
So your mention of JSON in your original question was off topic?
I don't want to waste time on your trolling.
-
How extensible are Protocol Buffers? As far as I can see they are extensible at a "structure level" (I'm not sure if there is a proper term for it; by "structure level" I mean extending some language with structures like in C in place of a type (joining a few types together), for example (pseudocode): `qux: struct {foo: string, baz: integer}; qux new-variable = struct {foo: "***", baz: 42}`). Or are they extensible at a deeper level (parsing types, e.g. `new-type: <"-">; new-type new-variable = 2-3`)?
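To illustrate what "extensible at a deeper level" could mean, here is a small Python sketch in which new literal syntaxes are registered with the parser itself, so that a token like `2-3` is recognised as a value of a new type. Everything here (`LITERALS`, `literal`, `parse_token`, the `span` tuple) is a hypothetical design for illustration and says nothing about what Protocol Buffers supports.

```python
import re

LITERALS = []   # (compiled pattern, converter) pairs, tried in registration order

def literal(pattern):
    """Decorator that registers a converter for tokens fully matching pattern."""
    def register(fn):
        LITERALS.append((re.compile(pattern), fn))
        return fn
    return register

@literal(r"(\d+)-(\d+)")
def parse_span(m):
    # New literal syntax: "2-3" becomes a tagged span value.
    return ("span", int(m.group(1)), int(m.group(2)))

@literal(r"-?\d+")
def parse_int(m):
    return int(m.group(0))

def parse_token(token):
    """Return the value of the first registered literal type that matches."""
    for pattern, convert in LITERALS:
        m = pattern.fullmatch(token)
        if m:
            return convert(m)
    raise ValueError(f"unrecognised literal: {token!r}")
```

With this, `parse_token("2-3")` yields `("span", 2, 3)` and `parse_token("42")` yields `42`; adding another literal type means registering one more pattern.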
-
You having trouble extending a text-format? Whatever uni you represent, I'll come take a piss on them. I'll even pay for it myself.
-
Eddy. You're Dutch, not German, so you have no business paying for scat fetishes. ;P
Robust Services Core | Software Techniques for Lemmings | Articles