Intro to XML and JSON #7: Recap & Real World Use
26 Apr 2019
That’s a wrap! Let’s recap the battle of “XML vs. JSON,” discuss how they’re used in the real world, and take a final glance at the big picture.
After all, you might still be asking questions like:
- Which is better?
- When will I really use XML or JSON?
Posts In This Series
- Part 1 - Intro to XML and JSON
- Part 2 - Intro to XML and JSON #2: Data's Shape
- Part 3 - Intro to XML and JSON #3: XML Items & Keys
- Part 4 - Intro to XML and JSON #4: XML Values
- Part 5 - Intro to XML and JSON #5: XML/CSV Conversions
- Part 6 - Intro to XML and JSON #6: JSON
- Part 7 - This Article
- Part 8 - Intro to XML, JSON, & YAML: the book
Viewing “Pretty” XML & JSON
We’ll have a lot of examples in this series. I recommend that you edit them and play with seeing them in a “pretty” format! XML, JSON – paste & click “Tree View”.
Warning: only put sample data into the “beautifier” links above. Never put your company’s confidential data into a stranger’s web site.
XML vs. JSON
The advantages of each format stem from their unique traits.
- XML’s special talents
- Giving items names
- Comments
- Item IDs
- JSON’s special talents
- Not giving items names
- Lists
XML’s special talents
Giving items names
In data with a lot of internal variability, giving items names can make XML easy on human eyes.
For example, here’s what the first two levels of nested items look like right now in the XML behind the definition of my Salesforce “Contacts” table (the real file is about 7,000 lines long):
<CustomObject>
<actionOverrides>...(nested stuff)...</actionOverrides>
<actionOverrides>...(nested stuff)...</actionOverrides>
<actionOverrides>...(nested stuff)...</actionOverrides>
<actionOverrides>...(nested stuff)...</actionOverrides>
<compactLayoutAssignment>SYSTEM</compactLayoutAssignment>
<enableFeeds>true</enableFeeds>
<enableHistory>true</enableHistory>
<fieldSets>...(nested stuff)...</fieldSets>
<fieldSets>...(nested stuff)...</fieldSets>
<fields>...(nested stuff)...</fields>
<fields>...(nested stuff)...</fields>
<fields>...(nested stuff)...</fields>
<fields>...(nested stuff)...</fields>
<fields>...(nested stuff)...</fields>
<listViews>...(nested stuff)...</listViews>
<searchLayouts>...(nested stuff)...</searchLayouts>
<sharingModel>ReadWrite</sharingModel>
<webLinks>...(nested stuff)...</webLinks>
</CustomObject>
Particularly when items at a given level are sorted in alphabetical order, there’s something innately understandable about this data.
I like the way English words dominate the structure, burying the details when I use software such as Notepad++ to “fold” the XML and view only one layer at a time.
I doubt it’d be so fast to determine which part of the structure controls the “custom fields” on this table in a sea of {}s and []s where item “names” are just additional key-value pairs buried alongside the rest of an item’s data.
Here, it’s obvious that to add a new field to Salesforce, I’ll need to add a new one of those <fields>...</fields> thingies to the XML representing my table.
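If you'd rather let code do the "folding," here's a minimal Python sketch using the standard library's xml.etree.ElementTree. The XML string is a trimmed-down stand-in for the definition above: the nested contents (actionName, fullName, and their values) are invented for illustration, and count_child_names is just a helper name I picked.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed-down stand-in for the real definition; the nested contents are invented.
SAMPLE = """
<CustomObject>
    <actionOverrides><actionName>Edit</actionName></actionOverrides>
    <enableFeeds>true</enableFeeds>
    <fields><fullName>Favorite_Food__c</fullName></fields>
    <fields><fullName>Shoe_Size__c</fullName></fields>
    <sharingModel>ReadWrite</sharingModel>
</CustomObject>
"""


def count_child_names(xml_text):
    """Count how many times each item name appears one level below the root."""
    root = ET.fromstring(xml_text)
    return Counter(child.tag for child in root)


counts = count_child_names(SAMPLE)
for name in sorted(counts):
    print(name, counts[name])
```

One caveat: if your real file declares an XML namespace on its outermost item, ElementTree reports each name with that namespace in curly braces in front of it, so the names will look longer than they do here.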
Comments
I forgot to cover it, but XML lets you use comments.
This XML:
<Dan Bday="Jan. 27">
<!--
This is a multi-line XML comment.
The computer won't read any of this.
-->
<food>wine</food>
<!--
Nifty: I can "comment out" pickles
so that the computer won't think that Dan likes pickles,
but it will still be in this document
in case I change my mind later.
See next line.
-->
<!--<food>pickles</food>-->
</Dan>
Is equivalent to this XML, as far as computers are concerned:
<Dan Bday="Jan. 27">
<food>wine</food>
</Dan>
Of course, it’s messy to leave “pickles” in your XML if Dan no longer likes pickles.
Now you have things in your text file that the human eye is going to see, like “pickles” and “nifty,” but that its brain has to remember to ignore.
Use comments sparingly and carefully, if at all.
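To see the "computers don't read comments" idea in action, here's a small Python sketch. It leans on the fact that the standard library's xml.etree.ElementTree parser, with its default settings, drops comments while reading. The two XML strings are shortened versions of the examples above, and foods is a helper name I made up.

```python
import xml.etree.ElementTree as ET

WITH_COMMENTS = """
<Dan Bday="Jan. 27">
    <!-- This comment is for humans only. -->
    <food>wine</food>
    <!--<food>pickles</food>-->
</Dan>
"""

WITHOUT_COMMENTS = """
<Dan Bday="Jan. 27">
    <food>wine</food>
</Dan>
"""


def foods(xml_text):
    """Return the text of every <food> item directly under the outermost item."""
    root = ET.fromstring(xml_text)
    return [item.text for item in root.findall("food")]


print(foods(WITH_COMMENTS))
print(foods(WITHOUT_COMMENTS))
print(foods(WITH_COMMENTS) == foods(WITHOUT_COMMENTS))
```

Both versions should come back listing only wine; the commented-out pickles never reach your code.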
Item IDs
I forgot to cover it, but when you use the second (“attributes”) approach in XML, any attribute that is declared as an ID (by convention named “id”, as in HTML) must meet a special rule for your XML to be valid:
You may not use the same value twice, anywhere in the entire XML file.
This is extremely important for writing code that can “jump” to an item in an XML file.
For example, the HTML controlling this blog post includes the following element:
<h2 id="posts-in-this-series">
Posts In This Series
</h2>
Your web browser is coded to jump you directly to whichever piece of the HTML behind this article is ID’ed “posts-in-this-series” if you append “#posts-in-this-series” to the URL of this blog post.
That’s how I linked you to precise paragraphs of Marijn Haverbeke’s quotes about octopi when explaining JSON objects.
https://eloquentjavascript.net/2nd_edition/04_data.html is a very large web page.
However, Marijn was very forward-thinking (thank you!) and embedded an element called “a” at the beginning of each paragraph and gave it a unique ID, as below:
<p>
<a class="p_ident" id="p_FolCMJfte3" href="#p_FolCMJfte3"></a>
To briefly return to our tentacle model of variable bindings—property bindings are similar.
They <em>grasp</em> values, but other variables and properties might be holding onto those same values.
You may think of objects as octopuses with any number of tentacles, each of which has a name inscribed on it.
</p>
Therefore, to provide links to individual paragraphs of his book, all I had to do was use my web browser to inspect the source code of …/04_data.html, find “p_FolCMJfte3” attached to the paragraph that interested me, and link to …/04_data.html#p_FolCMJfte3.
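Here's roughly what "jumping" to an item by its ID can look like in code. This is a minimal Python sketch, not how a web browser actually does it. The PAGE string is a made-up, well-formed mash-up of the two snippets above wrapped in a body item, and find_by_id and duplicate_ids are helper names I invented. Real web pages often aren't well-formed XML, so ElementTree may refuse to read them.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up, well-formed page that combines shortened copies of the snippets above.
PAGE = """
<body>
    <h2 id="posts-in-this-series">Posts In This Series</h2>
    <p>
        <a class="p_ident" id="p_FolCMJfte3" href="#p_FolCMJfte3"></a>
        You may think of objects as octopuses with any number of tentacles.
    </p>
</body>
"""


def find_by_id(root, wanted_id):
    """Return the first nested item whose id attribute equals wanted_id, or None."""
    return root.find(f".//*[@id='{wanted_id}']")


def duplicate_ids(root):
    """Return any id values that are used more than once in the document."""
    counts = Counter(el.get("id") for el in root.iter() if el.get("id") is not None)
    return sorted(value for value, n in counts.items() if n > 1)


root = ET.fromstring(PAGE)
anchor = find_by_id(root, "p_FolCMJfte3")
print(anchor.tag, anchor.get("href"))
print(duplicate_ids(root))
```

The second helper is there because the "jump" only works reliably when nobody has reused an ID; an empty list means every ID in the document is unique.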
JSON’s special talents
Not giving items names
When I ask Salesforce Pardot’s API for JSON representing data in its “Prospects” table, the text Pardot sends me looks something like this:
[
{
"id": 1010101,
"address_one": "123 Sunny St"
},
{
"id": 2020202,
"address_one": "null"
}
]
I already know that the whole file is full of records from the “Prospect” table, so I like that my eyes and brain can lock in on English words representing data, not a bunch of repetitions of the word “prospect.”
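Because every record in the list has the same shape, reading it in code is short. Here's a minimal Python sketch using the standard library's json module. The payload is made up to look like the example above; it is not real Pardot output, and I haven't shown how to actually call Pardot. One thing worth knowing: JSON doesn't allow numbers written with leading zeros, so the made-up IDs here are written without them.

```python
import json

# Made-up payload shaped like the example above; not real API output.
PAYLOAD = """
[
    {"id": 1010101, "address_one": "123 Sunny St"},
    {"id": 2020202, "address_one": "null"}
]
"""

records = json.loads(PAYLOAD)
for record in records:
    # Each record is a plain dictionary; nothing in it repeats the word "prospect".
    print(record["id"], record["address_one"])

# A number with leading zeros is rejected by the parser.
try:
    json.loads('{"id": 01010101}')
    leading_zero_ok = True
except json.JSONDecodeError:
    leading_zero_ok = False
print("leading zeros accepted?", leading_zero_ok)
```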
Lists
Similarly to the benefits of “not giving items names,” JSON’s lists can reduce redundancy and store certain types of data very concisely.
This is particularly true when storing a list of simple values (e.g. plaintext, numbers).
[
"wine",
"pickles",
"cream cheese",
"herring",
"tabbouli"
]
JSON’s list format doesn’t always save space over XML, though. Listing back-to-back items could be shorter in either XML or JSON, depending on how long the items themselves are in each standard.
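If you're curious how the two formats compare for your own data, you can simply measure. This is a minimal Python sketch that writes the same list both ways and prints the lengths. The item names foods and food in the XML version are my own assumptions (XML makes you pick names, after all), and as_json and as_xml are helper names I made up.

```python
import json
import xml.etree.ElementTree as ET

FOODS = ["wine", "pickles", "cream cheese", "herring", "tabbouli"]


def as_json(items):
    """Write a list of plain-text values as compact JSON (no extra spaces)."""
    return json.dumps(items, separators=(",", ":"))


def as_xml(items, list_name="foods", item_name="food"):
    """Write a list of plain-text values as XML, one named item per value."""
    root = ET.Element(list_name)
    for value in items:
        ET.SubElement(root, item_name).text = value
    return ET.tostring(root, encoding="unicode")


json_text = as_json(FOODS)
xml_text = as_xml(FOODS)
print(len(json_text), json_text)
print(len(xml_text), xml_text)
```

For this particular list of simple values the JSON comes out shorter, but swap in your own data and names before drawing any conclusions.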
Real World Usage: APIs
Let’s be honest: you rarely get to choose how to write JSON or XML.
Usually, you’re trying to download and process data from some sort of cloud product’s “API” – or upload data into one.
In that case, what matters most to you is being able to recognize which kind of data the “API” has given you, or expects from you.
After all, it’s not like you can get on the phone and argue with a company about which format they should have picked!
Hint:
- “REST” APIs typically want to receive data from you as JSON.
- “SOAP” APIs typically want to receive data from you as XML.
- Which type of data they give you is a little more variable.
In many of my posts about APIs, I talk about writing Python code or other code that can talk to cloud databases this way.
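As a starting point for "recognizing which kind of data you've been given," here's a minimal Python sketch that simply tries to read a piece of text as JSON, then as XML. guess_format is a helper name I made up, and the sample strings are invented. In practice, an API's documentation (or the content type it reports alongside its response) is usually a better first clue than guessing.

```python
import json
import xml.etree.ElementTree as ET


def guess_format(text):
    """Return "json", "xml", or "unknown" for a chunk of response text."""
    try:
        json.loads(text)
        return "json"
    except json.JSONDecodeError:
        pass
    try:
        ET.fromstring(text)
        return "xml"
    except ET.ParseError:
        pass
    return "unknown"


print(guess_format('[{"id": 1, "address_one": "123 Sunny St"}]'))
print(guess_format('<Dan Bday="Jan. 27"><food>wine</food></Dan>'))
print(guess_format("id,address_one\n1,123 Sunny St"))
```

The third sample is CSV, which neither parser accepts, so it falls through to "unknown."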
Takeaways
- XML and JSON are quite similar standards for storing data as plain text, and both are optimized for storing “naturally list-shaped” data.
- However, they are sometimes used to store “naturally table-shaped” data when it’s desirable to avoid relying on line breaks to separate data.
- CSV is a standard for storing data as plain text that is optimized for storing “naturally table-shaped” data.
- XML vs. JSON: XML is great with items that need names, comments, and giving each item a unique ID.
- XML vs. JSON: JSON is great with items that don’t need names and values that can be listed.
- XML vs. JSON: It doesn’t really matter which is great for which. You usually use both of them with other people’s software and have to live with their choices.
Thanks for sticking with the whole tutorial. Hopefully you’ve learned:
- How to recognize “naturally table-shaped” vs. “naturally list-shaped” data, and how the difference between the two can impact attempts to convert data between CSV and XML/JSON formats.
- Details that will help you read and write XML and JSON files comfortably.
As always, I’m excited to hear about your successes, of which I’m sure you’ll have many.
Be sure to let me know if you do any awesome projects with XML or JSON.