Introduction
The essential features for learning Python quickly for immediate use :
- Using Python virtual environments (Published : January, 2020)
- Reading, writing JSON files, handling JSON data with Python (Published : April, 2020)
- Handling Python program arguments with the packages argparse and getopt (Published : April, 2020)
- Application configuration : environment variables, ini and YAML files (Published : April, 2020)
- Managing HTTP requests using the packages requests and httplib2 (Published : April, 2020)
This chapter explains how to read and write JSON data in a Python program with the standard library package json.
The Python environment
The Python environment is the following, sourced with the file $HOME/.python-3.8 :
$HOME/.python-3.8
#!/bin/bash
export PYHOME=/opt/python/python-3.8
export PATH=$PYHOME/bin:$PATH
export LD_LIBRARY_PATH=$PYHOME/lib:$LD_LIBRARY_PATH
export PYTHONPATH=/opt/python/packages
sqlpac@vpsfrsqlpac2$ . $HOME/.python-3.8
sqlpac@vpsfrsqlpac2$ which python3
/opt/python/python-3.8/bin/python3
sqlpac@vpsfrsqlpac2$ which pip3
/opt/python/python-3.8/bin/pip3
virtualenv is installed and a fully isolated virtual environment is set up for the project :
sqlpac@vpsfrsqlpac2$ cd /home/sqlpac
sqlpac@vpsfrsqlpac2$ virtualenv /home/sqlpac/google
Using base prefix '/opt/python/python-3.8'
New python executable in /home/sqlpac/google/bin/python3.8
Also creating executable in /home/sqlpac/google/bin/python
Installing setuptools, pip, wheel... done.
sqlpac@vpsfrsqlpac2$ source /home/sqlpac/google/bin/activate
(google) sqlpac@vpsfrsqlpac2:/home/sqlpac$
The JSON data sample
The Google Indexing API returns the following JSON format when the status of a given URL is requested :
{
  "url": "https://www.sqlpac.com/referentiel/docs/mariadb-columnstore-1.2.3-installation-standalone-ubuntu.html",
  "latestUpdate": {
    "url": "https://www.sqlpac.com/referentiel/docs/mariadb-columnstore-1.2.3-installation-standalone-ubuntu.html",
    "type": "URL_UPDATED",
    "notifyTime": "2020-04-10T17:43:21.198591915Z"
  }
}
The environment variable $PRJ is set to the working directory :
(google) sqlpac@vpsfrsqlpac2:/home/sqlpac$ mkdir google/json
(google) sqlpac@vpsfrsqlpac2:/home/sqlpac$ export PRJ=/home/sqlpac/google/json
(google) sqlpac@vpsfrsqlpac2:/home/sqlpac$ cd $PRJ
(google) sqlpac@vpsfrsqlpac2:/home/sqlpac/google/json$
Let’s see how to handle JSON in a Python program.
System package json
The package json is part of the Python standard library. There is nothing to install, just import it :
$PRJ/handling-json.py
import json
That’s all !
Reading JSON data
The method loads : loading from a string variable
Use the method loads to load JSON from a string variable :
import json
response_json='{"a":1, "b":2}'
loaded_json = json.loads(response_json)
for key in loaded_json:
    print("key : %s, value: %s" % (key, loaded_json[key]))
(google) sqlpac@vpsfrsqlpac2:/home/sqlpac/google/json$ python3 handling-json.py
key : a, value: 1
key : b, value: 2
In real life, JSON rarely fits in a single-line string. To define a JSON string over multiple lines, use a triple-quoted string :
import json
response_json = '''{
"url": "https://www.sqlpac.com/referentiel/docs/mariadb-columnstore-1.2.3-installation-standalone-ubuntu.html",
"latestUpdate":
{ "url": "https://www.sqlpac.com/referentiel/docs/mariadb-columnstore-1.2.3-installation-standalone-ubuntu.html",
"type": "URL_UPDATED",
"notifyTime": "2020-04-10T17:43:21.198591915Z"
},
"isactive" : true,
"floatvalue" : 1.2399,
"intvalue" : 1,
"ostypes" : ["linux","macos","windows"]
}'''
loaded_json = json.loads(response_json)
for key in loaded_json:
    print("%s %s %s" % (key, type(loaded_json[key]), loaded_json[key]))
Extra data are added in the JSON sample for the demo : isactive, floatvalue, intvalue, ostypes.
Data types are also displayed with the function type(). The data types are then the following :
Key | Type | Value |
---|---|---|
url | <class 'str'> | https://www.sqlpac.com/ref… |
latestUpdate | <class 'dict'> | {'url': 'https://www.sqlpac.com/…', 'type': 'URL_UPDATED'…} |
isactive | <class 'bool'> | True |
floatvalue | <class 'float'> | 1.2399 |
intvalue | <class 'int'> | 1 |
ostypes | <class 'list'> | ['linux', 'macos', 'windows'] |
For readers used to Javascript, the data types translate as follows :
Javascript | Python |
---|---|
Object | dict |
Array | list |
String | str |
Number (int) | int |
Number (float) | float |
true, false | True, False |
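As a quick check of the table above, the sketch below loads one value of each JSON type and records the name of the resulting Python type. The sample string and the variable names are made up for this illustration. JSON null, which does not appear in the sample data of this article, maps to None :

```python
import json

# Hypothetical sample covering each JSON type (not the Google API answer)
sample = '{"o": {"k": "v"}, "a": [1, 2], "s": "text", "i": 1, "f": 1.5, "t": true, "n": null}'

loaded = json.loads(sample)

# Map each key to the name of the Python type produced by json.loads
type_names = {key: type(value).__name__ for key, value in loaded.items()}
print(type_names)
```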
Values are read with the bracket notation :
print(loaded_json["url"])
https://www.sqlpac.com/referentiel…
So naturally, we try the Javascript dot notation syntax, but it does not work :
print(loaded_json.url)
Traceback (most recent call last):
  File "handling-json.py", line 23, in <module>
    print(loaded_json.url)
AttributeError: 'dict' object has no attribute 'url'
To use the dot notation, one approach is to create a class whose attributes are the top-level JSON keys :
import json

response_json = '''{
"url": "https://www.sqlpac.com/referentiel/docs/mariadb-columnstore-1.2.3-installation-standalone-ubuntu.html",
"latestUpdate":
{ "url": "https://www.sqlpac.com/referentiel/docs/mariadb-columnstore-1.2.3-installation-standalone-ubuntu.html",
"type": "URL_UPDATED",
"notifyTime": "2020-04-10T17:43:21.198591915Z"
},
"isactive" : true,
"floatvalue" : 1.2399,
"intvalue" : 1,
"ostypes" : ["linux","macos","windows"]
}'''

class google():
    def __init__(self, data):
        # Top-level JSON keys become instance attributes
        self.__dict__ = json.loads(data)

google_answer = google(response_json)
print(google_answer.url)
print(google_answer.latestUpdate["type"])
https://www.sqlpac.com/referentiel…
URL_UPDATED
As expected, google_answer.latestUpdate.type is not available as it would be in Javascript. Only the top-level keys become attributes, the nested object is still a dictionary and is read with google_answer.latestUpdate["type"].
Python is not Javascript, we sometimes have to give up our programming habits.
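If nested dot notation is really wanted, one possible alternative to the class above is the object_hook parameter of json.loads, combined with types.SimpleNamespace from the standard library. This is only a sketch : the shortened sample string is an assumption made for the demo, and attribute access only works for keys that are valid Python identifiers :

```python
import json
from types import SimpleNamespace

# Shortened, hypothetical version of the sample used in this article
response_json = '''{
  "url": "https://www.sqlpac.com/",
  "latestUpdate": { "type": "URL_UPDATED" },
  "ostypes": ["linux", "macos", "windows"]
}'''

# object_hook is called for every JSON object decoded, nested ones included,
# so each dictionary is replaced by a namespace with attributes
google_answer = json.loads(response_json, object_hook=lambda d: SimpleNamespace(**d))

print(google_answer.latestUpdate.type)
print(google_answer.ostypes[0])
```

Arrays are still plain Python lists, so indexes keep the bracket notation.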
The method load : loading from a file
When JSON data are stored in a file, the method load is used :
import json

with open('json-data.json', 'r') as f:
    json_dict = json.load(f)

print(json_dict["url"])
https://www.sqlpac.com/referentiel…
As in the previous example, create a class to use the dot notation :
import json

class google():
    def __init__(self, filename):
        # Top-level JSON keys read from the file become instance attributes
        with open(filename, 'r') as f:
            self.__dict__ = json.load(f)

google_answer = google('json-data.json')
print(google_answer.url)
https://www.sqlpac.com/referentiel…
Handling malformed JSON data
Use try / except blocks to manage exceptions encountered when loading malformed JSON data :
import json

with open('json-data.json') as f:
    try:
        data = json.load(f)
    except json.JSONDecodeError as e:
        # Raised by the json package when the content is not valid JSON
        print("Exception raised | %s " % str(e))
        exit()

print(data["url"])
Exception raised | Expecting ',' delimiter: line 6 column 5 (char 306)
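The exception raised by the json package for malformed data is json.JSONDecodeError, a subclass of ValueError. It exposes the error position through the attributes msg, lineno, colno and pos. A minimal sketch, where the helper name describe_json_error and the malformed sample are assumptions made for the illustration :

```python
import json

def describe_json_error(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as e:
        # msg, lineno, colno and pos locate the problem in the input
        return "%s (line %d, column %d, char %d)" % (e.msg, e.lineno, e.colno, e.pos)
    return None

# Hypothetical malformed sample : the comma between the two keys is missing
bad_json = '{"a": 1 "b": 2}'
print(describe_json_error(bad_json))
print(describe_json_error('{"a": 1}'))
```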
Duplicate key/value
What happens when a key is defined more than once :
{
"url": "1.html",
"url": 1
}
No exception is raised. The value kept, and therefore its data type, is the one from the last occurrence of the key in the JSON data :
import json
…
print("value : %s, data type : %s" % (data["url"], type(data["url"]) ))
value : 1, data type : <class 'int'>
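When silently keeping the last value is not acceptable, duplicates can be detected with the object_pairs_hook parameter of json.loads, which receives the key/value pairs of each object as a list before they are merged into a dictionary. A minimal sketch, in which the function name reject_duplicates is an assumption made for the illustration :

```python
import json

def reject_duplicates(pairs):
    """Build a dict from (key, value) pairs, refusing repeated keys."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError("Duplicate key : %s" % key)
        result[key] = value
    return result

duplicated = '{"url": "1.html", "url": 1}'

# Default behaviour : the last value wins
print(json.loads(duplicated))

# With the hook, the duplicate is reported instead
try:
    json.loads(duplicated, object_pairs_hook=reject_duplicates)
except ValueError as e:
    print(e)
```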
Returning and writing JSON data
Let’s imagine we want to return the following "dummy" answer :
{
  "url": "https://www.sqlpac.com/archives/2020",
  "ostypes": [ "linux", "macos","windows"],
  "isactive": true,
  "price": "12€",
  "details": {
    "returncode": 1,
    "reason": "none"
  }
}
The method dumps
The method dumps returns a JSON string from a Python dictionary :
import json

response = {}
response["url"] = "https://www.sqlpac.com/archives/2020"
response["ostypes"] = ["linux","macos","windows"]
response["isactive"] = True
response["price"] = "12$"
response["details"] = { "returncode": 1, "reason":"none" }

str_response = json.dumps(response)
print(str_response)
{"url": "https://www.sqlpac.com/archives/2020", "ostypes": ["linux", "macos", "windows"], "isactive": true, "price": "12$", "details": {"returncode": 1, "reason": "none"}}
Data types are correctly converted on the way back :
Python | Javascript |
---|---|
dict | Object |
list | Array |
str | String |
int | Number (int) |
float | Number (float) |
True, False | true, false |
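Not every Python type has a JSON equivalent. As a sketch of what happens at the edges : None becomes null, a tuple becomes an array, and an unsupported type such as a set raises TypeError unless a conversion function is given with the default parameter. The sample dictionary is made up for the illustration :

```python
import json

# Hypothetical data with types outside the table above
data = {"nothing": None, "pair": (1, 2), "tags": {"linux"}}

try:
    json.dumps(data)
except TypeError as e:
    # A set is not JSON serializable by default
    print("TypeError : %s" % e)

# default is called for objects json cannot serialize;
# here sets become sorted lists, anything else falls back to str()
str_data = json.dumps(data, default=lambda o: sorted(o) if isinstance(o, set) else str(o))
print(str_data)
```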
Human readable
Data are returned on a single line. Use the option indent to get a more human readable format :
str_response = json.dumps(response, indent=4)
print(str_response)
{
    "url": "https://www.sqlpac.com/archives/2020",
    "ostypes": [
        "linux",
        "macos",
        "windows"
    ],
    "isactive": true,
    "price": "12$",
    "details": {
        "returncode": 1,
        "reason": "none"
    }
}
Unicode
What if the data contain a non-ASCII character, for example 12€ instead of 12$ ? The response then looks like this :
…
"price": "12\u20ac",
…
By default, json.dumps guarantees an ASCII-only output, so non-ASCII characters are escaped. Set the option ensure_ascii to False to write these characters unchanged :
str_response = json.dumps(response, indent=4, ensure_ascii=False)
print(str_response)
…
"price": "12€",
…
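Both forms are valid JSON and decode to the same Python string, as this short check suggests (the variable names are assumptions made for the illustration) :

```python
import json

response = {"price": "12€"}

# ASCII only : the euro sign is written as an escape sequence
escaped = json.dumps(response)
# ensure_ascii=False : the euro sign is kept as is
readable = json.dumps(response, ensure_ascii=False)

print(escaped)

# Decoding either string gives back the original dictionary
print(json.loads(escaped) == json.loads(readable) == response)
```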
Sorting keys
By default, keys are written in the insertion order of the dictionary. To force an alphabetical ordering of the keys, set sort_keys to True :
str_response = json.dumps(response, indent=4, ensure_ascii=False, sort_keys=True)
print(str_response)
{
    "details": {
        "reason": "none",
        "returncode": 1
    },
    "isactive": true,
    "ostypes": [
        "linux",
        "macos",
        "windows"
    ],
    "price": "12€",
    "url": "https://www.sqlpac.com/archives/2020"
}
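To make the difference visible, this sketch dumps the same dictionary with and without sort_keys. Since Python 3.7 a dict keeps its insertion order, and that is the order json.dumps follows when sort_keys is not set. The small dictionary is an assumption made for the illustration :

```python
import json

response = {}
response["url"] = "https://www.sqlpac.com/archives/2020"
response["isactive"] = True
response["details"] = {"returncode": 1, "reason": "none"}

# Default : keys appear in insertion order
unsorted_response = json.dumps(response)
print(unsorted_response)

# sort_keys=True : keys are sorted at every nesting level
sorted_response = json.dumps(response, sort_keys=True)
print(sorted_response)
```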
The method dump
Use the method dump to write JSON data to a file. All the options described above for the method dumps are also available for the method dump :
# The encoding is set explicitly because ensure_ascii=False writes "€" as is
with open('response.json', 'w', encoding='utf-8') as f:
    json.dump(response, f, indent=4, ensure_ascii=False, sort_keys=False)
$PRJ/response.json
{
    "url": "https://www.sqlpac.com/archives/2020",
    "ostypes": [
        "linux",
        "macos",
        "windows"
    ],
    "isactive": true,
    "price": "12€",
    "details": {
        "returncode": 1,
        "reason": "none"
    }
}
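A round trip is a simple way to check the written file. In this sketch the file is created in a temporary directory, an assumption made to keep the example self-contained, and then read back with load :

```python
import json
import os
import tempfile

response = {"url": "https://www.sqlpac.com/archives/2020", "price": "12€"}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "response.json")

    # Write with an explicit encoding, since ensure_ascii=False keeps "€" as is
    with open(path, "w", encoding="utf-8") as f:
        json.dump(response, f, indent=4, ensure_ascii=False)

    # Read the file back into a new dictionary
    with open(path, "r", encoding="utf-8") as f:
        reloaded = json.load(f)

# True when the data survived the write and the read unchanged
print(reloaded == response)
```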
Conclusion
Serializing and deserializing JSON data is quite easy in Python, but Javascript habits (dot notation…) must be set aside when handling loaded JSON data in Python programs.