Since its inception, JSON has quickly become the de facto standard for information exchange. Chances are you’re here because you need to transport some data from here to there. Perhaps you’re gathering information through an API or storing your data in a document database. One way or another, you’re up to your neck in JSON, and you’ve got to Python your way out. Show
Luckily, this is a pretty common task, and—as with most common tasks—Python makes it almost disgustingly easy. Have no fear, fellow Pythoneers and Pythonistas. This one’s gonna be a breeze!
Free PDF Download: Python 3 Cheat Sheet A (Very) Brief History of JSONNot so surprisingly, JavaScript Object Notation was inspired by a subset of the JavaScript programming language dealing with object literal syntax. They’ve got a nifty website that explains the whole thing. Don’t worry though: JSON has long since become language agnostic and exists as its own standard, so we can thankfully avoid JavaScript for the sake of this discussion. Ultimately, the community at large adopted JSON because it’s easy for both humans and machines to create and understand. Remove adsLook, it’s JSON!Get ready. I’m about to show you some real life JSON—just like you’d see out there in the wild. It’s okay: JSON is supposed to be readable by anyone who’s used a C-style language, and Python is a C-style language…so that’s you!
As you can see, JSON supports primitive types, like strings and numbers, as well as nested lists and objects.
Whew! You survived your first encounter with some wild JSON. Now you just need to learn how to tame it. Python Supports JSON Natively!Python comes with a built-in package called 1 for encoding and decoding JSON data.Just throw this little guy up at the top of your file:
A Little VocabularyThe process of encoding JSON is usually called serialization. This term refers to the transformation of data into a series of bytes (hence serial) to be stored or transmitted across a network. You may also hear the term marshaling, but that’s a whole other discussion. Naturally, deserialization is the reciprocal process of decoding data that has been stored or delivered in the JSON standard.
Serializing JSONWhat happens after a computer processes lots of information? It needs to take a data dump. Accordingly, the 1 library exposes the 3 method for writing data to files. There is also a 4 method (pronounced as “dump-s”) for writing to a Python string.Simple Python objects are translated to JSON according to a fairly intuitive conversion. PythonJSON 5 6 7, 8 9 0 1 2, 3, 4 5 6 7 8 9 0 1A Simple Serialization ExampleImagine you’re working with a Python object in memory that looks a little something like this:
It is critical that you save this information to disk, so your mission is to write it to a file. Using Python’s context manager, you can create a file called 2 and open it in write mode. (JSON files conveniently end in a 3 extension.)
Note that 3 takes two positional arguments: (1) the data object to be serialized, and (2) the file-like object to which the bytes will be written.Or, if you were so inclined as to continue using this serialized JSON data in your program, you could write it to a native Python 0 object.
Notice that the file-like object is absent since you aren’t actually writing to disk. Other than that, 4 is just like 3.Hooray! You’ve birthed some baby JSON, and you’re ready to release it out into the wild to grow big and strong. Remove adsSome Useful Keyword ArgumentsRemember, JSON is meant to be easily readable by humans, but readable syntax isn’t enough if it’s all squished together. Plus you’ve probably got a different programming style than me, and it might be easier for you to read code when it’s formatted to your liking.
The first option most people want to change is whitespace. You can use the 0 keyword argument to specify the indentation size for nested structures. Check out the difference for yourself by using 1, which we defined above, and running the following commands in a console:>>>
Another formatting option is the 2 keyword argument. By default, this is a 2-tuple of the separator strings 3, but a common alternative for compact JSON is 4. Take a look at the sample JSON again to see where these separators come into play.There are others, like 5, but I have no idea what that one does. You can find a whole list in the if you’re curious.Deserializing JSONGreat, looks like you’ve captured yourself some wild JSON! Now it’s time to whip it into shape. In the 1 library, you’ll find 7 and 8 for turning JSON encoded data into Python objects.Just like serialization, there is a simple conversion table for deserialization, though you can probably guess what it looks like already. JSONPython 6 5 9 7 1 0 5 (int) 2 5 (real) 4 7 6 9 8 1 0Technically, this conversion isn’t a perfect inverse to the serialization table. That basically means that if you encode an object now and then decode it again later, you may not get exactly the same object back. I imagine it’s a bit like teleportation: break my molecules down over here and put them back together over there. Am I still the same person? In reality, it’s probably more like getting one friend to translate something into Japanese and another friend to translate it back into English. Regardless, the simplest example would be encoding a 8 and getting back a 7 after decoding, like so:>>>
A Simple Deserialization ExampleThis time, imagine you’ve got some data stored on disk that you’d like to manipulate in memory. You’ll still use the context manager, but this time you’ll open up the existing 2 in read mode.
Things are pretty straightforward here, but keep in mind that the result of this method could return any of the allowed data types from the conversion table. This is only important if you’re loading in data you haven’t seen before. In most cases, the root object will be a 5 or a 7.If you’ve pulled JSON data in from another program or have otherwise obtained a string of JSON formatted data in Python, you can easily deserialize that with 8, which naturally loads from a string:
Voilà! You’ve tamed the wild JSON, and now it’s under your control. But what you do with that power is up to you. You could feed it, nurture it, and even teach it tricks. It’s not that I don’t trust you…but keep it on a leash, okay? Remove adsA Real World Example (sort of)For your introductory example, you’ll use JSONPlaceholder, a great source of fake JSON data for practice purposes. First create a script file called 1, or whatever you want. I can’t really stop you.You’ll need to make an API request to the JSONPlaceholder service, so just use the 2 package to do the heavy lifting. Add these imports at the top of your file:
Now, you’re going to be working with a list of TODOs cuz like…you know, it’s a rite of passage or whatever. Go ahead and make a request to the JSONPlaceholder API for the 3 endpoint. If you’re unfamiliar with 2, there’s actually a handy 5 method that will do all of the work for you, but you can practice using the 1 library to deserialize the 7 attribute of the response object. It should look something like this: 0You don’t believe this works? Fine, run the file in interactive mode and test it for yourself. While you’re at it, check the type of 8. If you’re feeling adventurous, take a peek at the first 10 or so items in the list.>>> 1See, I wouldn’t lie to you, but I’m glad you’re a skeptic.
All right, time for some action. You can see the structure of the data by visiting the endpoint in a browser, but here’s a sample TODO: 2There are multiple users, each with a unique 00, and each task has a Boolean 01 property. Can you determine which users have completed the most tasks? 3Yeah, yeah, your implementation is better, but the point is, you can now manipulate the JSON data as a normal Python object! I don’t know about you, but when I run the script interactively again, I get the following results: >>> 4That’s cool and all, but you’re here to learn about JSON. For your final task, you’ll create a JSON file that contains the completed TODOs for each of the users who completed the maximum number of TODOs. All you need to do is filter 8 and write the resulting list to a file. For the sake of originality, you can call the output file 03. There are many ways you could go about this, but here’s one: 5Perfect, you’ve gotten rid of all the data you don’t need and saved the good stuff to a brand new file! Run the script again and check out 03 to verify everything worked. It’ll be in the same directory as 1 when you run it.Now that you’ve made it this far, I bet you’re feeling like some pretty hot stuff, right? Don’t get cocky: humility is a virtue. I am inclined to agree with you though. So far, it’s been smooth sailing, but you might want to batten down the hatches for this last leg of the journey. Remove adsEncoding and Decoding Custom Python ObjectsWhat happens when we try to serialize the 06 class from that Dungeons & Dragons app you’re working on? 6Not so surprisingly, Python complains that 06 isn’t serializable (which you’d know if you’ve ever tried to tell an Elf otherwise):>>> 7Although the 1 module can handle most built-in Python types, it doesn’t understand how to encode customized data types by default. It’s like trying to fit a square peg in a round hole—you need a buzzsaw and parental supervision.Simplifying Data StructuresNow, the question is how to deal with more complex data structures. Well, you could try to encode and decode the JSON by hand, but there’s a slightly more clever solution that’ll save you some work. Instead of going straight from the custom data type to JSON, you can throw in an intermediary step. All you need to do is represent your data in terms of the built-in types 1 already understands. Essentially, you translate the more complex object into a simpler representation, which the 1 module then translates into JSON. It’s like the transitive property in mathematics: if A = B and B = C, then A = C.To get the hang of this, you’ll need a complex object to play with. You could use any custom class you like, but Python has a built-in type called 11 for representing complex numbers, and it isn’t serializable by default. So, for the sake of these examples, your complex object is going to be a 11 object. Confused yet?>>> 8
A good question to ask yourself when working with custom types is What is the minimum amount of information necessary to recreate this object? In the case of complex numbers, you only need to know the real and imaginary parts, both of which you can access as attributes on the 11 object:>>> 9Passing the same numbers into a 11 constructor is enough to satisfy the 15 comparison operator:>>> 0Breaking custom data types down into their essential components is critical to both the serialization and deserialization processes. Encoding Custom TypesTo translate a custom object into JSON, all you need to do is provide an encoding function to the 3 method’s 17 parameter. The 1 module will call this function on any objects that aren’t natively serializable. Here’s a simple decoding function you can use for practice: 1Notice that you’re expected to raise a 19 if you don’t get the kind of object you were expecting. This way, you avoid accidentally serializing any Elves. Now you can try encoding complex objects for yourself!>>> 2
The other common approach is to subclass the standard 21 and override its 22 method: 3Instead of raising the 19 yourself, you can simply let the base class handle it. You can use this either directly in the 3 method via the 25 parameter or by creating an instance of the encoder and calling its 26 method:>>> 4Remove adsDecoding Custom TypesWhile the real and imaginary parts of a complex number are absolutely necessary, they are actually not quite sufficient to recreate the object. This is what happens when you try encoding a complex number with the 27 and then decoding the result:>>> 5All you get back is a list, and you’d have to pass the values into a 11 constructor if you wanted that complex object again. Recall our discussion about . What’s missing is metadata, or information about the type of data you’re encoding.I suppose the question you really ought ask yourself is What is the minimum amount of information that is both necessary and sufficient to recreate this object? The 1 module expects all custom types to be expressed as 30 in the JSON standard. For variety, you can create a JSON file this time called 31 and add the following 6 representing a complex number: 6See the clever bit? That 33 key is the metadata we just talked about. It doesn’t really matter what the associated value is. To get this little hack to work, all you need to do is verify that the key exists: 7If 33 isn’t in the dictionary, you can just return the object and let the default decoder deal with it.Every time the 7 method attempts to parse an 6, you are given the opportunity to intercede before the default decoder has its way with the data. You can do this by passing your decoding function to the 37 parameter.Now play the same kind of game as before: >>> 8While 37 might feel like the counterpart to the 3 method’s 17 parameter, the analogy really begins and ends there.This doesn’t just work with one object either. Try putting this list of complex numbers into 31 and running the script again: 9If all goes well, you’ll get a list of 11 objects:>>> 0You could also try subclassing 43 and overriding 37, but it’s better to stick with the lightweight solution whenever possible.All done!Congratulations, you can now wield the mighty power of JSON for any and all of your nefarious Python needs. While the examples you’ve worked with here are certainly contrived and overly simplistic, they illustrate a workflow you can apply to more general tasks:
What you do with your data once it’s been loaded into memory will depend on your use case. Generally, your goal will be gathering data from a source, extracting useful information, and passing that information along or keeping a record of it. Today you took a journey: you captured and tamed some wild JSON, and you made it back in time for supper! As an added bonus, learning the 1 package will make learning 51 and 52 a snap.Good luck with all of your future Pythonic endeavors! Mark as Completed Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Working With JSON Data in Python 🐍 Python Tricks 💌 Get a short & sweet Python Trick delivered to your inbox every couple of days. No spam ever. Unsubscribe any time. Curated by the Real Python team. Send Me Python Tricks » About Lucas Lofaro Lucas is a wandering Pythoneer with a curious mind and a desire to spread knowledge to those who seek it. » More about LucasEach tutorial at Real Python is created by a team of developers so that it meets our high quality standards. The team members who worked on this tutorial are: Aldren Dan Joanna Master Real-World Python Skills With Unlimited Access to Real Python Join us and get access to thousands of tutorials, hands-on video courses, and a community of expert Pythonistas: Level Up Your Python Skills » Master Real-World Python Skills Join us and get access to thousands of tutorials, hands-on video courses, and a community of expert Pythonistas: Level Up Your Python Skills » What Do You Think? Rate this article: Tweet Share Share EmailWhat’s your #1 takeaway or favorite thing you learned? How are you going to put your newfound skills to use? Leave a comment below and let us know. Commenting Tips: The most useful comments are those written with the goal of learning from or helping out other students. and get answers to common questions in our support portal. |