Salesforce, Python, SQL, & other ways to put your data where you need it -- a bilingual blog in English & French

Intro to XML and JSON #6: JSON

26 Apr 2019 🔖 xml json tutorials csv excel

We’ve seen JSON from a “30,000-foot view.”

We understand what kind of data JSON can help us with.

Let’s learn how to write JSON!

(And let’s talk a lot about octopi. 🐙 Thanks, Marijn Haverbeke.)


Warning: I’m afraid JSON might be getting the short end of the stick by coming after XML.

XML and JSON are both great standards for writing data shaped like “lists of lists,” and both deserve to be introduced as they are, not as a “compare and contrast” to the other.

I do want you to leave this series capable of comparing, contrasting, and translating between the two.

However, I fear that I will make JSON seem far more complicated than it is by introducing it as “compared to XML,” because I’m going to have to spend a lot of time talking about XML things you can’t do (often before covering JSON things you can do).

Please know that JSON is typically considered “simpler” to read and write than XML. To see explanations of JSON that introduce it in its own right, without defining it in comparison to XML, visit the links in Additional Resources at the bottom of this article.

Finally, as with XML, we’ll talk a lot about items, keys, and values, so click here if you need a refresher before starting today’s lesson.


Viewing “Pretty” XML & JSON

We’ll have a lot of examples in this series. I recommend that you edit them and play with seeing them in a “pretty” format! XML, JSON – paste & click “Tree View”.

Warning: only put sample data into the “beautifier” links above. Never put your company’s confidential data into a stranger’s web site.


Items in JSON: the “Object”

The punctuation that JSON uses to define the beginning and end of an “object” is a set of “curly braces.” It looks like this:

{}

The fact that the pair of curly braces exists in your text file means that it exists as a conceptual “item” in your data.


Empty Objects

It doesn’t matter whether or not there’s anything typed between the curly braces.

The “object” is still a conceptual item that “exists” in your data, simply because you bothered to create a pair of curly braces to represent it.

If it doesn’t have anything between the braces, you can think of it a little like a row full of nothing but commas in a CSV file.

It’s still there!

It’s just blank.
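
If you'd like to check this for yourself, here is a minimal sketch, assuming Python 3 and its built-in json module (the variable name is just something I made up for illustration). It shows that an empty object still parses into a real, if empty, item:

```python
import json

# Parse the smallest possible JSON object.
empty_item = json.loads("{}")

# Python represents a JSON object as a dict.
# This one exists, but it has no key-value pairs in it.
print(type(empty_item).__name__)  # dict
print(len(empty_item))            # 0
print(empty_item is not None)     # True
```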


A uniquely JSON concept: the “List”

It’s hard to explain exactly what a JSON “list” is (a.k.a. an “array”), when thinking about “items,” “keys,” and “values,” because it’s not quite any of them.

It certainly doesn’t appear in XML or CSV.

And yet, there’s a very clear punctuation standard for writing a JSON list.

The punctuation that JSON uses to define the beginning and end of a “list” is a set of “square brackets.” It looks like this:

[]

As with empty objects, it’s valid to include an empty list in your data set. (Whether you think it’s meaningful to do so is a business question you’ll have to answer.)

I’m going to go into more detail later about how and when to take advantage of “lists” in JSON, so don’t get too caught up in this yet.

Nevertheless, I’d like to show you a quick preview of how lists are used in JSON.

Remember our “birthday ideas list” from the second article?

I would probably put Ridhi’s “birthday” & “collection” key-value pairs into a JSON object (Ridhi being the “item”) like this:

{
	"Bday" : "Sep. 16",
	"Collection" : "frogs"
}

See how every value (e.g. “frogs”) inside an object (e.g. this representation of Ridhi) is labeled by a unique key (e.g. “Collection”)?

Like the “attributes” that can go inside the opening tag of an XML element, JSON key-value pairs within an “object” have unique keys.

That leaves us in a bit of a pickle (no pun intended) when it comes to describing Dan’s two favorite foods: wine & pickles.

What are we going to do? Call them “food 1” & “food 2?”

{
	"food 1" : "wine",
	"food 2" : "pickles"
}

Yuck. That’s almost as ugly as trying to make a CSV file out of our “friends list” was.

Instead, we can use a JSON list to contain all of the values that we might want to label “food”:

[
	"wine",
	"pickles"
]

See how nice it is not to have to come up with a unique key name for each food Dan likes?

That’s what JSON lists are for!

The entire object representing Dan could look like this:

{
	"Bday" : "Jan. 27",
	"Food" :
		[
			"wine",
			"pickles"
		]
}

I like the way Marijn Haverbeke explains lists in a book called Eloquent JavaScript (images also from the book):

Marijn on objects:

“You may think of objects as octopuses with any number of tentacles, each of which has a name tattooed on it.”

Marijn on lists (a.k.a. “arrays”):

“Arrays, then, are just a kind of object specialized for storing sequences of things. … You can see them as long, flat octopuses with all their tentacles in a neat row, labeled with numbers.”

Therefore, in our data, “Dan” is one octopus whose “food” tentacle is holding another octopus. The inner octopus has implied numeric labels on its tentacles, and its tentacles are holding things like “wine” and “pickles.”
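
Here is a small sketch of the two octopuses in action, assuming Python 3 and its built-in json module (the variable names are my own). Named tentacles are reached with a key, and the implied numeric labels start counting at 0:

```python
import json

dan_text = """
{
    "Bday" : "Jan. 27",
    "Food" :
        [
            "wine",
            "pickles"
        ]
}
"""

dan = json.loads(dan_text)

# The outer octopus: tentacles are labeled with names (keys).
print(dan["Bday"])   # Jan. 27

# The inner octopus: tentacles are labeled with implied numbers, starting at 0.
foods = dan["Food"]
print(foods[0])      # wine
print(foods[1])      # pickles
print(len(foods))    # 2
```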


“Keys” & their “Values”

Indicating “key-value pairs” to describe a JSON object (or “item”) is a bit like the second (“attributes”) approach to indicating them in XML.

  • As with XML attributes, you are not allowed to use a key more than once to describe any given item (although you are free to reuse it again elsewhere).
  • However, unlike XML attributes, in JSON you are allowed to nest things inside the values of these key-value pairs. In fact, in JSON, it’s highly encouraged!
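
A quick sketch of why repeated keys are worth avoiding, assuming Python 3 and its built-in json module. Strictly speaking, the JSON standard (RFC 8259) only says that keys "should" be unique, and different tools handle repeats differently. Python's parser, for one, doesn't complain. It quietly keeps the last value and drops the rest:

```python
import json

# The same key ("food") is used twice inside one object.
text = '{"food" : "wine", "food" : "pickles"}'

dan = json.loads(text)

# No error is raised, but "wine" is gone.
print(dan)  # {'food': 'pickles'}
```

Silently losing data is arguably worse than an error message, so it's safest to treat "one key, one time per object" as a firm rule.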

The punctuation for JSON key-value pairs is also slightly different than it is with XML attributes:

  1. A key-value pair is connected in JSON with a colon (“:”).
    (In XML, an equals sign.)
  2. The “key” in a JSON key-value pair must be in double quotes.
    (In XML, it shouldn’t be.)
  3. The “value” in a JSON key-value pair does not have to be in quotes.
    (In XML, the “values” of “attributes” have to be in quotes.)
    • You only include quotes around a JSON “value” to indicate that it’s “definitely text.”
      5 is the number 5, but "5" is the textual representation of that number. (JSON only recognizes double quotes, so '5' is not valid.)
    • Numbers and the special keywords true, false, and null are interpreted as special values by JSON-reading tools when you leave off the quotes (although be careful not to include commas in the numbers if you’re not putting them in quotes!).
    • The value in a JSON key-value pair can even be another JSON object or a JSON list.
      (We’ll get to that in a minute – it’s pretty powerful.)
  4. You separate key-value pairs from each other with commas in JSON.
    (In XML, you use whitespace.)
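
To see what the quotes actually change, here is a minimal sketch, assuming Python 3 and its built-in json module. The keys "a" through "e" and the "salary" key are made-up sample data, not part of our birthday list:

```python
import json

text = '{"a" : 5, "b" : "5", "c" : null, "d" : "null", "e" : 1500.25}'
data = json.loads(text)

# Show each value alongside the kind of thing Python decided it was.
for key, value in data.items():
    print(key, repr(value), type(value).__name__)
# a 5 int
# b '5' str
# c None NoneType
# d 'null' str
# e 1500.25 float

# A comma inside an unquoted number is not valid JSON.
try:
    json.loads('{"salary" : 1,500}')
    comma_number_parsed = True
except json.JSONDecodeError as error:
    comma_number_parsed = False
    print("Could not parse:", error)
```

Without quotes, 5 and null arrive as a number and a "nothing" value. With quotes, they are just text that happens to look like them.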

Here again is our “Ridhi” object, with its two key-value pairs:

{
	"Bday" : "Sep. 16",
	"Collection" : "frogs"
}

Whereas in XML’s “attributes” approach to listing key-value pairs, Ridhi might have looked like this:

<Ridhi Bday="Sep. 16" Collection="frogs"/>

Nesting

Because JSON allows us to be more flexible with the contents of “values” than XML’s attributes notation allows, we can do this with Dan in JSON:

{
	"Bday" : "Jan. 27",
	"Food" :
		[
			"wine",
			"pickles"
		]
}

This would have been invalid XML:

<Dan Bday="Jan. 27" Food=<wine/><pickles/> />

Instead, we would have had to use XML’s first approach for indicating key-value pairs (“items within items”) to represent Dan’s food preferences:

<Dan Bday="Jan. 27">
	<food>wine</food>
	<food>pickles</food>
</Dan>
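
If you ever need to translate between the two shapes, it might look something like this rough sketch. It assumes Python 3 with its built-in json and xml.etree.ElementTree modules, and it is hard-coded to Dan's particular structure rather than being a general-purpose converter:

```python
import json
import xml.etree.ElementTree as ET

xml_text = """
<Dan Bday="Jan. 27">
    <food>wine</food>
    <food>pickles</food>
</Dan>
""".strip()

root = ET.fromstring(xml_text)

# Start with the attributes, which are already key-value pairs.
dan = dict(root.attrib)

# Gather the repeated <food> elements into one list under a single key.
dan["Food"] = [food.text for food in root.findall("food")]

# Write the result out as "pretty" JSON.
print(json.dumps(dan, indent=4))
```

Notice that the two XML "food" elements collapse into one JSON key holding a list, which is the structural difference this section is about.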

XML vs. JSON = different structures

As you can see, the structure of JSON is noticeably different from XML.

Even though XML and JSON are quite similar, and even though they’re designed to represent the same shape of data, their slight differences mean that you make pretty different structural decisions about how to write them.

Such differences might even impact your choice of XML vs. JSON.

These differences also mean that when you’re reading someone else’s data, if they provide it to you in both XML and JSON, you should be prepared for the data to be structured differently between the two files they provide you – even if the two files represent the same real-world data.


JSON = BYON (“bring your own names”)

Have you noticed yet that JSON “objects” don’t have names?

I keep saying “the Ridhi object” or “the Ridhi octopus,” but there’s nothing in it that actually says “Ridhi!”

{
	"Bday" : "Sep. 16",
	"Collection" : "frogs"
}

XML elements and JSON objects both represent the concept of an “item,” but XML elements have names and JSON objects don’t.

If you want a JSON object named, you’ll have to do it yourself.

For example, something that you could represent in XML like this:

<Shirt Color="" Fabric=""></Shirt>

Might have to become this in JSON:

{
 "type" : "Shirt",
 "Color" : null,
 "Fabric" : null
}

See how we took advantage of the fact that the word “type” wasn’t otherwise needed as the name of a “key” and staked a claim on it?

All we’re doing to “name” JSON objects is making up a new key-value pair and adding it to the object (preferably as the first one, for human readability).

We could do something similar for Ridhi:

{
	"Name" : "Ridhi",
	"Bday" : "Sep. 16",
	"Collection" : "frogs"
}

Names? We don’t need no stinkin’ names!

Not all data needs names. Sometimes, you know what’s in your data.

JSON is more concise than XML when your data doesn’t need names.

  • Q: How can I tell when names are unnecessary?
  • A: As follows:
    • You have a bunch of conceptual “items” back-to-back at the same level of nesting
    • None of them need a “plain-text value” representing what they truly are
    • All of them just contain key-value pairs describing what they have

🚗 🚘 🚙 🚌 🏎️

Let’s go back to representing a fleet of cars.

Here’s some XML representing a fleet of cars, each of which has a different set of key-value traits we care about tracking.

<RootElement>
 <Car color="blue" trim="chrome" trunk="hatchback"/>
 <Car appeal="sporty" doors="2"/>
 <Car doors="4" color="red" make="Ford"/>
</RootElement>

Maybe we already know they’re all cars based on the context of our data.

Maybe it feels silly to call them each “Car.”

A nice JSON equivalent might be:

[
 {
  "color" : "blue",
  "trim" : "chrome",
  "trunk" : "hatchback"
 },
 {
  "appeal" : "sporty",
  "doors" : "2"
 },
 {
  "doors" : "4",
  "color" : "red",
  "make" : "Ford"
 }
]
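
Here is a small sketch of how a program might read that nameless fleet, assuming Python 3 and its built-in json module. Because no two cars are guaranteed to share the same keys, it uses .get() with a fallback (the fallback text is my own invention) rather than assuming every car has a "color":

```python
import json

fleet_text = """
[
 {
  "color" : "blue",
  "trim" : "chrome",
  "trunk" : "hatchback"
 },
 {
  "appeal" : "sporty",
  "doors" : "2"
 },
 {
  "doors" : "4",
  "color" : "red",
  "make" : "Ford"
 }
]
"""

fleet = json.loads(fleet_text)

# The cars have no names, only positions in the list (starting at 0).
for position, car in enumerate(fleet):
    color = car.get("color", "(no color recorded)")
    print(position, color, sorted(car.keys()))
# 0 blue ['color', 'trim', 'trunk']
# 1 (no color recorded) ['appeal', 'doors']
# 2 red ['color', 'doors', 'make']
```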

Grammar subtlety: Back-to-back “objects”

You can’t simply list JSON objects back-to-back the way you can put XML elements back-to-back.

This isn’t valid JSON:

{}
{}

Instead, you have to put them inside square-brackets and separate them with commas (that is, you have to make them members of a JSON list).

(Tip: Remember not to put a comma after the last one, since nothing comes next. Messing up the commas is an easy copy/paste mistake when you’re editing JSON by hand.)

This is how you show 2 JSON objects at the same level as each other:

[
 {},
 {}
]

Note that the line breaks and tabs are for human benefit only. This is the same JSON, written the way you’ll often find it when you download JSON from 3rd parties:

[{},{}]
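
Both claims are easy to test. This is a minimal sketch, assuming Python 3 and its built-in json module: it confirms that the "pretty" and compact versions mean the same thing, and that a stray comma after the last object gets rejected.

```python
import json

pretty = """
[
 {},
 {}
]
"""
compact = "[{},{}]"

# Whitespace between the punctuation makes no difference to the data.
print(json.loads(pretty) == json.loads(compact))  # True

# json.dumps can write either style from the same data.
data = json.loads(compact)
print(json.dumps(data, separators=(",", ":")))    # [{},{}]
print(json.dumps(data, indent=1))

# A comma after the last object is a mistake.
try:
    json.loads("[{},{},]")
    trailing_comma_parsed = True
except json.JSONDecodeError as error:
    trailing_comma_parsed = False
    print("Could not parse:", error)
```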

Grammar subtlety: Roots

Just like the entirety of a piece of XML must be contained within a single element, the entirety of a piece of JSON must be contained within a single “octopus.”

However, you can choose either type of “octopus”:

  • An “object” {}
  • A “list” []

Examples

This is valid JSON – it has an “object” “outer octopus.”

{"name" : "Dan", "Bday" : "Jan. 27"}

This is not valid JSON – too many “outer octopi.”

{"name" : "Dan", "Bday" : "Jan. 27"}
{"name" : "Ridhi", "Bday" : "Sep. 16"}

This is valid JSON – it has a “list” “outer octopus.”

[
	{"name" : "Dan", "Bday" : "Jan. 27"},
	{"name" : "Ridhi", "Bday" : "Sep. 16"}
]

This one is a special case. The original JSON specification (RFC 4627) required an “outer octopus,” so some older tools will reject it, but the current standards (RFC 8259 and ECMA-404) accept a lone value like this as valid JSON:

"wine"

This is valid JSON – it has a “list” “outer octopus.”

["wine"]

This is valid JSON – it has a “list” “outer octopus.”

["wine","pickles"]

This is valid JSON – it has an “object” “outer octopus.”

{"food" : "wine"}

This is not valid JSON – look carefully … no “outer octopus” at all.

"food" : "wine"

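You can run the examples above through a parser to see which ones it accepts. This is a rough sketch, assuming Python 3 and its built-in json module. The helper function parses_as_json is something I made up for this post. Keep in mind that Python follows the newer standard, so it accepts a lone value like "wine" even though stricter or older tools might not.

```python
import json

def parses_as_json(text):
    """Return True if Python's built-in parser accepts the text as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

examples = [
    '{"name" : "Dan", "Bday" : "Jan. 27"}',
    '{"name" : "Dan", "Bday" : "Jan. 27"}\n{"name" : "Ridhi", "Bday" : "Sep. 16"}',
    '[{"name" : "Dan", "Bday" : "Jan. 27"}, {"name" : "Ridhi", "Bday" : "Sep. 16"}]',
    '"wine"',
    '["wine"]',
    '["wine","pickles"]',
    '{"food" : "wine"}',
    '"food" : "wine"',
]

results = [parses_as_json(text) for text in examples]

for accepted, text in zip(results, examples):
    print(accepted, text.replace("\n", " "))
```

The two that fail are the pair of back-to-back objects and the key-value pair with no braces around it.
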
Lists, again

Note that the values in the list might be plaintext / numbers / null / another list / another item (“object”) / etc.

(Just like the “value” of a key-value pair in an “object” can be plaintext, numbers, null, a list, another object, etc.)
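
As a quick illustration, here is a sketch of one list holding all of those kinds of values at once, assuming Python 3 and its built-in json module. The list itself is a made-up sample that mixes pieces from earlier examples:

```python
import json

mixed_text = '["wine", 5, null, ["wine", "pickles"], {"Bday" : "Jan. 27"}]'
mixed = json.loads(mixed_text)

# Each tentacle of the list can hold a different kind of thing.
kinds = [type(value).__name__ for value in mixed]

for kind, value in zip(kinds, mixed):
    print(kind, value)
# str wine
# int 5
# NoneType None
# list ['wine', 'pickles']
# dict {'Bday': 'Jan. 27'}
```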


Takeaways

  1. Both JSON & XML need a single “root” item surrounding the entire contents of a dataset.
  2. JSON has “lists,” which require thinking about “octopi” to understand the role they play in storing data.
    • Lists are not quite like items, keys, or values in the ways we’ve been referring to these concepts (although according to “octopus” theory, lists are merely a strange, special kind of item).
  3. JSON doesn’t name things as explicitly as XML names things.
    • Whether that’s good or bad depends on your business case.
  4. JSON lets you play with the use of quotation marks to indicate how you want values to be treated (as plaintext, as a number, etc.).

Additional Resources
