The Forum for Discussion about The Third Manifesto and Related Matters

JSON cp XML and other interchange formats: why so verbose?

(There was a previous discussion about JSON here.)

JSON "uses human-readable text to store and transmit data objects" says the wiki. In contrast XML (or CSV?) are alleged to be non-human-readable.

But would any human try to read any of these formats as text? Wouldn't they use some sort of widget in their browser? (Or Excel for CSV.)

In practice, in the corporate world, JSON formats seem to use VeryLongAndPernicketlyPreciseMultisegmentNames for everything, which renders them unreadable to humans anyway. (I guess they want each name to be globally unique across the whole enterprise and any outside body they might interact with.)

I get that JSON might be used to transmit an ad-hoc part of a transaction, not necessarily a whole table or a whole record eligible for a database table.

Is it common to transmit a schema in front of JSON data? Or at least some sort of identifier for the schema/version, so the receiving end gets a heads-up on whether it'll be able to interpret the content?

JSON claims to support field type-safety better than XML(?), but having a valid number in a place where there's supposed to be a date, or omitting the date field from a transaction altogether, is not what I'd call 'type safe'.

< Insert old lag rant about the industry seeming to move two steps back for every step forward. Just because comms bandwidth has grown enormously doesn't mean we have to fill it all up. >

Anthony,

I've had plenty of experience with JSON at a variety of jobs and open source projects, and I don't see anyone making a lot of the claims you attribute to JSON.

I see no one saying JSON is more type safe, or more human readable, than XML or anything else.  Where do you see anyone claiming this?

If anything, JSON is expressly very unstructured and practically everyone uses it however they want, same as XML, with higher levels of abstraction being independent of the format.

It is true that JSON is less verbose than XML, and it has distinct formats for booleans and numbers rather than it all being strings, but that's about as far as it goes vs XML.
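
To make that difference concrete, here is a minimal sketch using only Python's standard library. The payloads and field names are made up for illustration; they don't come from any real API.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payload, invented for this example.
payload = '{"paid": true, "quantity": 3, "price": 9.5, "shipped": "2022-05-27", "note": null}'
record = json.loads(payload)

# JSON booleans, numbers and null arrive as distinct native types.
types = {key: type(value).__name__ for key, value in record.items()}
print(types)

# The date is only a string, though, and nothing stops a sender from
# putting a number there or leaving the field out entirely.
bad = json.loads('{"paid": true, "quantity": 3, "price": 9.5, "shipped": 20220527}')
print(type(bad["shipped"]).__name__)

# The same data as plain XML (no XSD): every value is text.
order = ET.fromstring("<order><paid>true</paid><quantity>3</quantity></order>")
xml_types = {child.tag: type(child.text).__name__ for child in order}
print(xml_types)
```

So the parser hands you typed scalars for free, but whether a given field is present and holds the right kind of value is still up to the application.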

The most common usage of JSON is for small snippets of structured data in client-server interactions, such as with web apps or mobile/phone apps, for remote procedure calls and such.  It's also popular for config files.

I have seen no examples of VeryLongNames in JSON; if anything, the names tend to be similar in length to what you might give tuple attributes in a database.

It is NOT common to transmit a schema in front of JSON data.  Usually the meaning of the data is understood implicitly by both parties, such as by the fact that one is invoking a particular REST API which expects a particular input and gives a particular output.

Or discoverability is provided by means of a separate API call.
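
Where a sender does want to give the receiver a heads-up, one possible convention is to wrap the data in an envelope that carries a schema identifier. The sketch below is only an illustration: the field names `schema` and `data` and the identifiers `invoice/1` and `invoice/2` are assumptions of mine, not part of JSON or of any standard.

```python
import json

# Hypothetical identifiers this receiver knows how to interpret.
SUPPORTED_SCHEMAS = {"invoice/1", "invoice/2"}


def decode(message: str) -> dict:
    """Parse an envelope and refuse content whose schema is unknown."""
    envelope = json.loads(message)
    schema = envelope.get("schema")  # assumed field name
    if schema not in SUPPORTED_SCHEMAS:
        raise ValueError(f"unsupported schema: {schema!r}")
    return envelope["data"]  # assumed field name


print(decode('{"schema": "invoice/2", "data": {"total": 10}}'))
```

This only tells the receiver whether it recognises the label; it says nothing about whether the content actually conforms.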

Quote from AntC on May 27, 2022, 2:19 am

...

It's just a recognisable file format. It works nicely for some things, not others. Properly-formatted JSON is quite readable, but I prefer YAML.

Sounds like some abuses of JSON there, much like abuses of everything by incompetents.

I'm fond of long names, because AReallyLongNameCanBeSelfDescribing, unlike ARLNCBSD or whatever.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

I also use JSON a lot. It supports (but does not enforce) all the expected predefined types well enough and comes with enough pre-written libraries that you can always get typed data from over here to over there. It's almost always a better choice than CSV or plain text.

Three major failings: no date/time, no comments, no enums. They should fix that. You have to fake them with strings or objects and pray a bit.
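
As a rough illustration of the faking, here is a minimal Python sketch that writes a date/time as an ISO 8601 string and an enum as its string value, then converts them back by hand. The `Status` enum and the field names are hypothetical, and the receiver has to know out of band which strings to convert.

```python
import json
from datetime import datetime, timezone
from enum import Enum


class Status(Enum):  # hypothetical enum, for illustration only
    OPEN = "open"
    CLOSED = "closed"


def encode_extra(value):
    """Called by json.dumps for values it cannot serialise itself."""
    if isinstance(value, datetime):
        return value.isoformat()  # date/time faked as an ISO 8601 string
    if isinstance(value, Enum):
        return value.value  # enum faked as its string value
    raise TypeError(f"cannot serialise {type(value).__name__}")


record = {
    "status": Status.OPEN,
    "created": datetime(2022, 5, 29, 1, 11, tzinfo=timezone.utc),
}
text = json.dumps(record, default=encode_extra)
print(text)

# Nothing in the JSON text marks these strings as special, so the
# receiver converts them back explicitly.
raw = json.loads(text)
decoded = {
    "status": Status(raw["status"]),
    "created": datetime.fromisoformat(raw["created"]),
}
print(decoded == record)
```

The praying comes in on the decode side: a sender that uses a different date format or an unknown enum value will only be caught when the conversion fails.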

No schema, by design. Just use XML if that's your bag.

Corporate JSON is a contradiction in terms. Just use XML.

I don't like YAML, the spec is just too hard and there are too many gotchas. I believe later versions are working on that. I have used it for hand-written config files and init data, and it's standard with Rails. For data interchange, just use JSON.

Andl - A New Database Language - andl.org
Quote from dandl on May 29, 2022, 1:11 am

...

Indeed, XML or JSON are preferable for machine-to-machine data transfer.

For human-to-machine data transfer, I like YAML over JSON and XML.

"Corporate JSON" may be a contradiction in terms, but it's common in the enterprise IT world. So is YAML.

By 'corporate JSON' I mean more or less what the OP was complaining about: layers of standards, conventions, compliance monitoring, versioning, etc. Often the best answer to 'how do I get JSON to do this' is: don't, use something else. XML or Protocol Buffers or whatever. No shortage of solutions out there looking for a problem.

Quote from AntC on May 27, 2022, 2:19 am

Is it common to transmit a schema in front of JSON data? ...

JSON claims to support field type-safety better than XML(?), but having a valid number in a place where there's supposed to be a date, or omitting the date field from a transaction altogether, is not what I'd call 'type safe'.

< Insert old lag rant about the industry seeming to move two steps back for every step forward. Just because comms bandwidth has grown enormously doesn't mean we have to fill it all up. >

Schema : no, that would very much defeat JSON's purpose of being "more concise" than XML

Type-safety : yes, better than XML, but that's because XML taken on its own is not the same thing as XML+XSD.

Comms bandwidth : reminds me of a boss of mine in my first job to whom I made a remark that the solution to our problem at hand was buying more disks.  "Pointless", says he, "because as soon as it's there it'll just get used."

And BTW, the only way to make machine-to-machine data exchange 100% reliable and robust is to write out the spec ***in full***.  Domains (as in sets of permitted values), constraints (as in rules that further constrain which combinations of values can and cannot appear), everything.  And if one bothers to spend the time to do that (no one does, and it shows), "PernicketlyPreciseAndInFull", one can just as well complete the job and write a similarly pernicketly precise spec of the physical encodings too, right down to the bare bits.  No solution built on lexing strings is ever going to beat that.

In a previous job I worked with EDIFACT, which was oh, so wonderful: it had technical specifications for standardized messages for every application domain. Just implement and you can communicate with anybody. Of course, to help you along, you could purchase expensive software and expensive consulting services that could validate the messages for you.

Imagine the disappointment when it turned out that even when you both were using an alphanumerical code of length 3 as per the standard, you still couldn't understand the other party's codes!

It gets worse. So now we are wiser and we define that the meaning of a certain timestamp is "The time at which goods entered the customs territory". Turns out that it still doesn't help a whole lot because in one country (guess which) "entering the customs territory" means crossing an imaginary line x nautical miles off the coast, while in another country it means when the little lead seal on the container was broken inside the country's borders.

The point of the story is that you need to validate the data at the level where it is meaningful, which is only ever in the precise business domain. All validation at technical levels is pretty much pointless.
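
A toy sketch of the difference, in Python: the first check is the technical one (a three-character alphanumeric code, as in the story above), the second is the one that matters. The code list, the field name `package_code` and the message are all invented for illustration.

```python
import json
import re

# Hypothetical code list agreed with one particular trading partner.
KNOWN_PACKAGE_CODES = {"BOX", "PAL", "DRM"}


def technically_valid(code) -> bool:
    """Format check only: three upper-case alphanumeric characters."""
    return isinstance(code, str) and re.fullmatch(r"[A-Z0-9]{3}", code) is not None


def meaningful(code) -> bool:
    """Domain check: is this a code this application actually understands?"""
    return code in KNOWN_PACKAGE_CODES


message = json.loads('{"package_code": "CTN"}')
code = message["package_code"]

print(technically_valid(code))  # passes the format check
print(meaningful(code))  # but this receiver has no idea what it means
```

The format check would pass in any validator on the market, and the message would still be useless to the receiving application.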

XML started out nice and simple, although James Clark later realized that it should have been even simpler because DTDs were completely useless and counterproductive. The point of the format was to be able to communicate information in very free form, even semi-structured data, which is now being utilized in embedded XBRL, where you create beautiful documents for your shareholders and the relevant data are tagged to be machine-extractable. But the snake-oil salesmen had to come up with XSD and such, and validating against those is just as pointless as validating against the technical specifications of EDIFACT.

So developers came up with JSON, which (horror of security horrors) was originally parsed by a simple "eval" in JavaScript, no snake-oil needed. It isn't quite as good at semi-structured data as XML and is a bit clunky for metadata, but it works. Don't hold your breath, though: there are people working on JSON schemas as we speak, which will again be pretty pointless.

In other news, I hear that ASN.1 is experiencing a revival in IoT and mobile communication.

Of course, you need to know how to serialize and deserialize your data to and from bits, possibly with text on the way, but it is only in your application that the data can be valid or non-valid, and that should be painstakingly mapped out. Here's a good way to recreate meaning at the boundary

(And on the subject of YAML I have only one comment: significant whitespace, really???)

Quote from tobega on May 31, 2022, 1:47 pm

...

(And on the subject of YAML I have only one comment: significant whitespace, really???)

And in Python too, of course.

Perhaps enough tools use/require/promote YAML that I've become inured to the abuse, but I'm ok with it.

JSON also has the problem that it doesn't have a single, fully defined spec, so there are various gotchas and corner cases, and different implementations behave differently in some respects.

See http://seriot.ch/parsing_json.php and https://github.com/nst/JSONTestSuite for more about this.
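
Two such corner cases are easy to see with Python's standard `json` module; other parsers may well behave differently, which is the point. The `strict_loads` helper below is a hypothetical wrapper of my own, not part of any library.

```python
import json

# Duplicate keys: Python's parser silently keeps the last value.
lenient = json.loads('{"a": 1, "a": 2}')
print(lenient)

# NaN is not valid JSON, but Python's parser accepts it by default.
nan_value = json.loads("NaN")
print(nan_value)


def reject_constant(name):
    """Called by the parser for NaN, Infinity and -Infinity."""
    raise ValueError(f"non-standard constant: {name}")


def reject_duplicates(pairs):
    """Called with each object's (key, value) pairs in document order."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError(f"duplicate keys in object: {keys}")
    return dict(pairs)


def strict_loads(text: str):
    """Hypothetical stricter wrapper around json.loads."""
    return json.loads(
        text,
        parse_constant=reject_constant,
        object_pairs_hook=reject_duplicates,
    )


print(strict_loads('{"a": 1, "b": [true, null]}'))
```

With the wrapper, both of the inputs above raise `ValueError` instead of being quietly accepted.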
