Introduction
The essential features to learn Python quickly for immediate use :
- Using Python virtual environments (Published : January, 2020)
- Reading, writing JSON files, handling JSON data with Python (Published : April, 2020)
- Handling Python program arguments with the packages argparse and getopt (Published : April, 2020)
- Application configuration : environment variables, ini and YAML files (Published : April, 2020)
- Managing HTTP requests using the packages requests and httplib2 (Published : April, 2020)
Obviously, application configuration values should never be hard coded in programs, whether they are Python programs or not.
Configuration can be retrieved from :
- Environment variables
- Ini files
- JSON files
- XML files
- YAML files
This chapter covers how to read (and write) configuration data with Python from environment variables, INI files with configparser and YAML files with the package PyYAML.
XML is not covered here. The XML format is less used nowadays : JSON and YAML are more human readable, and XML parsers are a little bit heavy.
The JSON format is not covered in this article either, a dedicated article is published on this topic : Python, Reading and writing JSON with the package json
Environment variables, module os
sqlpac@vpsfrsqlpac2$ export CFG=/home/sqlpac/cfg
sqlpac@vpsfrsqlpac2$ echo $CFG
/home/sqlpac/cfg
To read environment variables, import the module os and call the function getenv or the method get of the mapping object os.environ :
import os
confdir = os.getenv('CFG')
homedir = os.environ.get('HOME')
print(confdir)
print(homedir)
/home/sqlpac/cfg
/home/sqlpac
When the environment variable does not exist, getenv and environ.get both return None.
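Both calls also accept a default value as second argument, returned instead of None when the variable is not set. A minimal sketch, assuming a hypothetical variable name SQLPAC_UNDEFINED_VAR and arbitrary default paths chosen only for illustration :

```python
import os

# SQLPAC_UNDEFINED_VAR is a hypothetical name; make sure it is not set.
os.environ.pop('SQLPAC_UNDEFINED_VAR', None)

# Without a default value, None is returned.
missing = os.getenv('SQLPAC_UNDEFINED_VAR')

# With a second argument, that default value is returned instead.
confdir = os.getenv('SQLPAC_UNDEFINED_VAR', '/tmp/cfg')
homedir = os.environ.get('SQLPAC_UNDEFINED_VAR', '/tmp/home')

print(missing)
print(confdir)
print(homedir)
```

Providing a default avoids a separate test on None when a sensible fallback exists.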
The environment variable can also be retrieved directly with the syntax os.environ["Environment Variable"], without using the get methods.
import os
confdir = os.environ['CFG']
But in that case the exception KeyError must be handled when the environment variable does not exist, instead of testing whether None is returned as with the get methods :
import os
varenv = 'CFG2'
try:
    confdir = os.environ[varenv]
except KeyError:
    print('Environment variable %s does not exist' % (varenv))
Environment variable CFG2 does not exist
Most important : how to make an environment variable available to sub programs (shell, Python…) ?
Just set the environment variable with os.environ['var'] in the parent program, it is then available to the sub programs.
The parent program below calls two sub programs, subprogram.py and build.bash :
import os
os.environ['VERSION']='4.2'
os.system('python3 subprogram.py')
os.system('./build.bash')
Python subprogram.py : 4.2
Shell ./build.bash : 4.2
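The inheritance can be checked without any external file. A minimal sketch, assuming the module subprocess is used instead of os.system so that the child output can be captured; the inline child program is only a stand-in, not the original subprogram.py :

```python
import os
import subprocess
import sys

# Variable set in the parent program.
os.environ['VERSION'] = '4.2'

# One-line child program standing in for a real sub program.
child_code = "import os; print(os.environ.get('VERSION'))"

# The child process inherits the environment of the parent by default.
result = subprocess.run([sys.executable, '-c', child_code],
                        capture_output=True, text=True, check=True)

print('Child process sees VERSION =', result.stdout.strip())
```

The variable is only visible to processes started after the assignment; the shell that launched the parent program is not modified.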
Ini files, module configparser
Reading an INI file
A sample INI file (nested sections cannot be implemented) :
sqlpac.ini
[sqlpac]
version=5.8
verbosity=2
debug=false
user=sqlpac
wwwurl=https://www.sqlpac.com/
rpc=https://www.sqlpac.com/rpc-secure/
[referential]
dir=https://www.sqlpac.com/referentiel/docs
[googleindexing]
apikey=ApIkeYdGF_kBtPVdAwIM7F0Fu87qWMoykfyl9hfnG2
jsonfile=google-auth-indexing.json
scopes=https://www.googleapis.com/auth/indexing
notification=https://indexing.googleapis.com/v3/urlNotifications/metadata?url=
publish=https://indexing.googleapis.com/v3/urlNotifications:publish
[mobiletest]
serviceurl=https://searchconsole.googleapis.com/v1/urlTestingTools/mobileFriendlyTest:run
Use the module configparser to read an INI file. A parser object is created with configparser.ConfigParser() and its method read is called with the INI file path as argument :
import configparser
cfg = configparser.ConfigParser()
cfg.read('sqlpac.ini')
The method sections returns the section names in a list object :
import configparser
cfg = configparser.ConfigParser()
cfg.read('sqlpac.ini')
print(cfg.sections())
['sqlpac', 'referential', 'googleindexing', 'mobiletest']
Values are retrieved with the usual dictionary syntax :
print(cfg['sqlpac']['debug'])
print(cfg['sqlpac']['version'])
for key in cfg['googleindexing']:
    print(cfg['googleindexing'][key])
false
5.8
ApIkeYdGF_kBtPVdAwIM7F0Fu87qWMoykfyl9hfnG2
google-auth-indexing.json
https://www.googleapis.com/auth/indexing
https://indexing.googleapis.com/v3/urlNotifications/metadata?url=
https://indexing.googleapis.com/v3/urlNotifications:publish
The ConfigParser object does not guess data types : every value is returned as a string. To convert to the right data type, use the appropriate get method :
version = cfg['sqlpac'].getfloat('version')
verbosity = cfg['sqlpac'].getint('verbosity')
debug = cfg['sqlpac'].getboolean('debug')
As with a dictionary, the get methods accept a fallback value, returned when the key does not exist :
debug = cfg['sqlpac'].getboolean('debug', False)
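The typed getters and their fallback values can be tried without a file on disk. A minimal sketch, assuming a reduced inline copy of the [sqlpac] section loaded with read_string; the keys timeout and trace are hypothetical and deliberately absent from the sample :

```python
import configparser

# Reduced inline copy of the [sqlpac] section.
sample = """
[sqlpac]
version=5.8
verbosity=2
debug=false
"""

cfg = configparser.ConfigParser()
cfg.read_string(sample)
section = cfg['sqlpac']

# Raw access always returns a string.
raw_version = section['version']

# Typed getters convert the string to the requested type.
version = section.getfloat('version')
verbosity = section.getint('verbosity')
debug = section.getboolean('debug')

# Hypothetical keys missing from the sample: the fallback is returned.
timeout = section.getint('timeout', fallback=30)
trace = section.getboolean('trace', fallback=False)

print(type(raw_version), version, verbosity, debug, timeout, trace)
```

Passing fallback as a keyword argument makes the intent explicit.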
Using variables, interpolations
To avoid redundancy, variables can be used in INI files :
[sqlpac]
wwwurl=https://www.sqlpac.com
rpc=%(wwwurl)s/rpc-secure
By default, basic interpolation is enabled in ConfigParser. %(var)s is evaluated on demand, where var is defined in the same section (or in the special DEFAULT section), so there is no need to define and use the variables in a specific order.
print(cfg['sqlpac']['rpc'])
https://www.sqlpac.com/rpc-secure
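The on-demand evaluation can be observed with a short test. A minimal sketch, assuming an inline sample in which rpc is deliberately declared before wwwurl :

```python
import configparser

# rpc is declared before wwwurl on purpose: the order does not matter.
sample = """
[sqlpac]
rpc=%(wwwurl)s/rpc-secure
wwwurl=https://www.sqlpac.com
"""

cfg = configparser.ConfigParser()
cfg.read_string(sample)

# Interpolation happens when the value is read.
rpc = cfg['sqlpac']['rpc']

# raw=True returns the value as written in the file, without interpolation.
rpc_raw = cfg['sqlpac'].get('rpc', raw=True)

print(rpc)
print(rpc_raw)
```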
Basic interpolation only evaluates variables defined in the same section. When values from other sections are needed, extended interpolation must be set in the config parser object. With extended interpolation, variables use the notation ${section:var} ; when the section is omitted, the current section is used.
[sqlpac]
wwwurl=https://www.sqlpac.com
rpc=${wwwurl}/rpc-secure
[referential]
dir=${sqlpac:wwwurl}/referentiel/docs
import configparser
from configparser import ExtendedInterpolation
cfg = configparser.ConfigParser(interpolation=ExtendedInterpolation())
cfg.read('sqlpac.ini')
print(cfg['sqlpac']['rpc'])
print(cfg['referential']['dir'])
https://www.sqlpac.com/rpc-secure
https://www.sqlpac.com/referentiel/docs
Basic and extended interpolations are mutually exclusive : they cannot be used together in the same config parser object.
With basic and extended interpolation respectively, double the characters % and $ to escape them when they are used in configuration values, otherwise they are treated as interpolation markers.
# % doubled to bypass basic interpolation
notif=50%% done
# $ doubled to bypass extended interpolation
price=$$10
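The effect of the escaping can be verified with two parsers, one per interpolation mode. A minimal sketch, assuming inline samples that contain only the escaped directive :

```python
import configparser
from configparser import ExtendedInterpolation

# Basic interpolation (default): %% is read back as a single %.
basic = configparser.ConfigParser()
basic.read_string("""
[sqlpac]
notif=50%% done
""")

# Extended interpolation: $$ is read back as a single $.
extended = configparser.ConfigParser(interpolation=ExtendedInterpolation())
extended.read_string("""
[sqlpac]
price=$$10
""")

notif = basic['sqlpac']['notif']
price = extended['sqlpac']['price']

print(notif)
print(price)
```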
Delimiters, comments
The default delimiters and comment prefixes are the following :
delimiters=('=', ':')
comment_prefixes=('#', ';')
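The first occurrence of a delimiter in a line is considered as the delimiter for that line. A comment prefix only marks a comment at the beginning of a line : inline comments are disabled by default (option inline_comment_prefixes).
Obviously, this can be overridden when creating the config parser object :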
cfg = configparser.ConfigParser(interpolation=ExtendedInterpolation(),
                                delimiters=('=', ':', '~'),
                                comment_prefixes=('#', ';', '@'))
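The custom delimiter and comment prefix can be tested on an inline sample. A minimal sketch, assuming the same overrides as above; the second parser additionally sets inline_comment_prefixes to show the difference with full-line comments :

```python
import configparser

sample = """
@ full-line comment using the custom prefix
[sqlpac]
user ~ sqlpac
wwwurl = https://www.sqlpac.com ; kept as part of the value
"""

# Custom delimiter ~ and custom comment prefix @.
cfg = configparser.ConfigParser(delimiters=('=', ':', '~'),
                                comment_prefixes=('#', ';', '@'))
cfg.read_string(sample)
user = cfg['sqlpac']['user']
wwwurl = cfg['sqlpac']['wwwurl']

# Same settings, with inline comments enabled for the prefix ;
cfg_inline = configparser.ConfigParser(delimiters=('=', ':', '~'),
                                       comment_prefixes=('#', ';', '@'),
                                       inline_comment_prefixes=(';',))
cfg_inline.read_string(sample)
wwwurl_inline = cfg_inline['sqlpac']['wwwurl']

print(user)
print(wwwurl)
print(wwwurl_inline)
```

In the second parser the text after the ; is dropped because the prefix is preceded by a whitespace.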
Writing an INI file
Less used, but good to know : to write an INI file, populate the config parser object like a dictionary and use the method write :
import configparser
config = configparser.ConfigParser()
config['sqlpac'] = {}
config['sqlpac']['wwwurl'] = 'https://www.sqlpac.com'
config['sqlpac']['rpc'] = '${wwwurl}/rpc-secure'
with open('sqlpac2.ini', 'w') as cfgfile:
    config.write(cfgfile)
sqlpac2.ini
[sqlpac]
wwwurl = https://www.sqlpac.com
rpc = ${wwwurl}/rpc-secure
Another way, using the methods add_section and set :
import configparser
config = configparser.ConfigParser()
config.add_section('sqlpac')
config.set('sqlpac', 'wwwurl', 'https://www.sqlpac.com')
config.set('sqlpac', 'rpc', '${wwwurl}/rpc-secure')
with open('sqlpac2.ini', 'w') as cfgfile:
    config.write(cfgfile)
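The generated text can be checked by reading it back. A minimal sketch, assuming an in-memory buffer (io.StringIO) instead of a file on disk and the method read_dict to populate the parser from a plain dictionary :

```python
import configparser
import io

config = configparser.ConfigParser()

# Populate the parser from a plain dictionary of sections.
config.read_dict({
    'sqlpac': {'wwwurl': 'https://www.sqlpac.com', 'user': 'sqlpac'},
    'referential': {'dir': 'https://www.sqlpac.com/referentiel/docs'},
})

# Write to an in-memory buffer instead of a file.
buffer = io.StringIO()
config.write(buffer)
text = buffer.getvalue()
print(text)

# Read the generated text back to check the round trip.
check = configparser.ConfigParser()
check.read_string(text)
sections = check.sections()
user = check['sqlpac']['user']
```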
YAML
YAML : YAML Ain’t Markup Language. It is even more human readable than the JSON format.
Translating the INI file to YAML format
Let’s write the previous sqlpac.ini file in YAML format :
sqlpac.yaml
sqlpac:
  version: 5.8
  verbosity: 2
  debug: false
  user: sqlpac
  wwwurl: https://www.sqlpac.com
  rpc: https://www.sqlpac.com/rpc-secure
referential:
  dir: https://www.sqlpac.com/referentiel/docs
googleindexing:
  apikey: ApIkeYdGF_kBtPVdAwIM7F0Fu87qWMoykfyl9hfnG2
  jsonfile: google-auth-indexing.json
  scopes: https://www.googleapis.com/auth/indexing
  rooturl: https://indexing.googleapis.com/v3
  endpoints:
    notification: https://indexing.googleapis.com/v3/urlNotifications/metadata?url=
    publish: https://indexing.googleapis.com/v3/urlNotifications:publish
mobiletest:
  serviceurl: https://searchconsole.googleapis.com/v1/urlTestingTools/mobileFriendlyTest:run
YAML is very interesting : the subsection endpoints can be introduced, which was not possible in the INI file.
But the bad news : variables cannot be used as in the INI file with the config parser interpolation.
The YAML specification does not allow defining variables. Anchors can be defined, but they are only used to duplicate values : concatenation is forbidden, and the syntax below raises an error :
sqlpac:
  wwwurl: &url https://www.sqlpac.com
  rpc: *url/rpc-secure
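The difference between duplicating a value and concatenating can be tested from Python. A minimal sketch, assuming the package PyYAML is installed and using its function safe_load on inline documents; the key home is hypothetical :

```python
import yaml

# Valid use of an anchor: the alias duplicates the anchored value as is.
valid = """
sqlpac:
  wwwurl: &url https://www.sqlpac.com
  home: *url
"""
cfg = yaml.safe_load(valid)
home = cfg['sqlpac']['home']
print(home)

# Invalid use: text appended to the alias is rejected by the parser.
invalid = """
sqlpac:
  wwwurl: &url https://www.sqlpac.com
  rpc: *url/rpc-secure
"""
try:
    yaml.safe_load(invalid)
    rejected = False
except yaml.YAMLError as err:
    rejected = True
    print('Parsing error :', type(err).__name__)
```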
Installing PyYAML
The YAML parser is not part of the Python standard library, an additional package must be installed. If it is not already installed, install the package PyYAML :
pip3 search PyYAML
PyYAML (5.3.1) - YAML parser and emitter for Python
pip3 install PyYAML
Successfully built PyYAML
Installing collected packages: PyYAML
Successfully installed PyYAML-5.3.1
Loading YAML files
To load a YAML file, import the module yaml and call the function load :
import yaml
with open("sqlpac.yaml", "r") as ymlfile:
    cfg = yaml.load(ymlfile, Loader=yaml.FullLoader)
print(cfg["googleindexing"]["endpoints"])
print(cfg["googleindexing"]["endpoints"]["notification"])
{'notification': 'https://indexing.googleapis.com/v3/urlNotifications/metadata?url=', 'publish': 'https://indexing.googleapis.com/v3/urlNotifications:publish'}
https://indexing.googleapis.com/v3/urlNotifications/metadata?url=
Since version 5.1, the load function must be called with the Loader option, otherwise a warning is raised (since PyYAML 6.0 the Loader argument is mandatory and omitting it raises a TypeError) :
YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe.
Please read https://msg.pyyaml.org/load for full details.
Valid values for the Loader option are :
- BaseLoader : only loads the most basic YAML.
- SafeLoader : loads a subset of the YAML language, safely. This is recommended for loading untrusted input.
- FullLoader : loads the full YAML language. Avoids arbitrary code execution.
- UnsafeLoader : the original Loader code that could be easily exploitable by untrusted data input.
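The choice of loader changes what is returned. A minimal sketch, assuming an inline document and comparing SafeLoader (through the shortcut function safe_load) with BaseLoader :

```python
import yaml

document = """
sqlpac:
  version: 5.8
  debug: false
"""

# yaml.safe_load(stream) is a shortcut for yaml.load(stream, Loader=yaml.SafeLoader).
cfg_safe = yaml.safe_load(document)

# BaseLoader does not resolve data types: every scalar is loaded as a string.
cfg_base = yaml.load(document, Loader=yaml.BaseLoader)

print(cfg_safe)
print(cfg_base)
```

For a configuration file made of plain scalars, lists and dictionaries, safe_load is usually sufficient.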
What about data types ? Unlike INI files with configparser, the appropriate data type is applied when loading and no conversion is needed :
import yaml
with open("sqlpac.yaml", "r") as ymlfile:
    cfg = yaml.load(ymlfile, Loader=yaml.FullLoader)
for key in ('version','verbosity','debug'):
    print('%s : %s, %s' % (key, cfg["sqlpac"][key], type(cfg["sqlpac"][key])))
version : 5.8, <class 'float'>
verbosity : 2, <class 'int'>
debug : False, <class 'bool'>
The data type can be enforced in the YAML file, for example to get the directive version as a string and not as a float :
sqlpac:
  version: !!str 5.8
  …
version : 5.8, <class 'str'>
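Explicit tags can be tried on an inline document. A minimal sketch, assuming safe_load and adding the tag !!float on verbosity purely for illustration (it is not part of the original file) :

```python
import yaml

document = """
sqlpac:
  version: !!str 5.8
  verbosity: !!float 2
"""

cfg = yaml.safe_load(document)
version = cfg['sqlpac']['version']
verbosity = cfg['sqlpac']['verbosity']

# The explicit tags override the type that would be guessed from the value.
print(type(version), type(verbosity))
```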
Writing YAML files
To write a YAML file, build a dictionary and use the dump function :
import yaml
cfgyaml = {}
cfgyaml["sqlpac"] = {}
cfgyaml["sqlpac"]["user"] = "sqlpac"
cfgyaml["sqlpac"]["wwwurl"] = "https://www.sqlpac.com"
cfgyaml["google"] = {}
cfgyaml["google"]["apis"] = ["googleindexing", "googleanalytics"]
with open("sqlpac2.yaml", "w") as f:
    yaml.dump(cfgyaml, f, sort_keys=False)
sqlpac2.yaml
sqlpac:
  user: sqlpac
  wwwurl: https://www.sqlpac.com
google:
  apis:
  - googleindexing
  - googleanalytics
The sort_keys option of the dump function is only available starting with PyYAML version 5.1, released in March 2019. By default (sort_keys=True), keys are sorted alphabetically.
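The effect of sort_keys is visible when the dictionary is not built in alphabetical order. A minimal sketch, assuming PyYAML 5.1 or later and dumping to a string (no stream argument) instead of a file :

```python
import yaml

# wwwurl is inserted before user on purpose.
cfgyaml = {"sqlpac": {"wwwurl": "https://www.sqlpac.com", "user": "sqlpac"}}

# Default behaviour: keys are sorted alphabetically.
sorted_text = yaml.dump(cfgyaml)

# sort_keys=False keeps the insertion order of the dictionary.
ordered_text = yaml.dump(cfgyaml, sort_keys=False)

print(sorted_text)
print(ordered_text)
```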
Conclusion : INI or YAML for the configuration file ?
When lists and nested dictionaries are intensively used in the configuration file, YAML is suitable, but remember that variables are not possible.
Furthermore, YAML is independent of any programming language, which matters if the file must be exchanged with other platforms.
INI files with configparser are the best choice when variables are needed, but type conversions must be handled in this context and the INI file is then dependent on the platform and on the Python language.