Super complete! Common ways to write configuration files in Python

PHPz
2023-04-13 08:31:05

Why write configuration files

During development we often rely on fixed parameters or constants. These stable, frequently used values are usually collected in a dedicated file, which avoids repeating them across modules and keeps the core code clean.

We can put these values directly into a .py file, such as settings.py or config.py. The advantage is that, within the same project, we can pull in parts of it directly with import. But if the configuration needs to be shared with non-Python platforms, a single .py file is not a good choice.

In that case we should choose a general-purpose configuration file format to store these fixed values. Popular formats currently include ini, json, toml, yaml and xml, all of which can be parsed with the standard library or with third-party libraries.

ini

ini is short for Initialize. In the early days it was the storage format for configuration files on Windows. ini files are simple and easy to read; they usually consist of sections, keys and values, in the following form:

[localdb]
host = 127.0.0.1
user = root
password = 123456
port = 3306
database = mysql

Python ships with the configparser standard library, which we can use directly to parse ini files. For example, we save the content above in a file named db.ini, parse it with the read() method, and then use the items() method to obtain all key-value pairs under the specified section.

>>> from configparser import ConfigParser
>>> cfg = ConfigParser()
>>> cfg.read("/Users/Bobot/db.ini")
['/Users/Bobot/db.ini']
>>> cfg.items("localdb")
[('host', '127.0.0.1'), ('user', 'root'), ('password', '123456'), ('port', '3306'), ('database', 'mysql')]

Note that configparser returns every value as a string by default, which is why the values in db.ini are written as bare literals without quotation marks.
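
When a value is needed as a number or a Boolean rather than a string, configparser's typed getters can do the conversion. The following is a minimal sketch: it inlines the db.ini content with read_string() so that it runs on its own, and the use_ssl option is a hypothetical name that db.ini does not define, used only to show the fallback argument.

```python
from configparser import ConfigParser

# Same content as db.ini above, inlined so the sketch is self-contained.
INI_TEXT = """
[localdb]
host = 127.0.0.1
user = root
password = 123456
port = 3306
database = mysql
"""

cfg = ConfigParser()
cfg.read_string(INI_TEXT)

raw_port = cfg.get("localdb", "port")      # plain get() returns a str
int_port = cfg.getint("localdb", "port")   # getint() converts to int

print(type(raw_port).__name__, raw_port)
print(type(int_port).__name__, int_port)

# fallback= supplies a default when the option is missing
# ("use_ssl" is a hypothetical option, not part of db.ini).
use_ssl = cfg.getboolean("localdb", "use_ssl", fallback=False)
print(use_ssl)
```

getfloat() works the same way for floating-point values.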

After obtaining the key-value pairs, I convert them directly into a dictionary and unpack it as keyword arguments to keep the code simple:

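#!pip install pymysql
import pymysql
from configparser import ConfigParser

cfg = ConfigParser()
cfg.read("/Users/Bobot/db.ini")
# items() returns (key, value) pairs; every value is a string
db_cfg = dict(cfg.items("localdb"))
# the port should be an integer, so convert it before connecting
db_cfg["port"] = cfg.getint("localdb", "port")
# unpack the dictionary as keyword arguments
con = pymysql.connect(**db_cfg)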

json

The json format is a file format we all encounter regularly, and it is also a popular data exchange format on the Internet. In addition, json is sometimes used for configuration files.

For example, npm (a JavaScript package manager similar to Python's pip) and Microsoft's widely used VSCode editor both use json for their configuration.

As with configparser, Python has a built-in json standard library, which parses json content from files and from strings through the load() and loads() methods respectively.

{
    "localdb": {
        "host": "127.0.0.1",
        "user": "root",
        "password": "123456",
        "port": 3306,
        "database": "mysql"
    }
}

We save the content above as db.json and then read and parse it. The json library reads json files easily and parses them straight into Python dictionary objects.

>>> import json
>>> from pprint import pprint
>>>
>>> with open('/Users/Bobot/db.json') as j:
...     cfg = json.load(j)['localdb']
...
>>> pprint(cfg)
{'database': 'mysql',
 'host': '127.0.0.1',
 'password': '123456',
 'port': 3306,
 'user': 'root'}
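
The string counterpart, loads(), works the same way when the json text is already in memory. The sketch below inlines the db.json content as a string so that it runs without a file, and also shows dumps() for the reverse direction.

```python
import json

# Same content as db.json, held in a string instead of a file.
JSON_TEXT = """
{
    "localdb": {
        "host": "127.0.0.1",
        "user": "root",
        "password": "123456",
        "port": 3306,
        "database": "mysql"
    }
}
"""

cfg = json.loads(JSON_TEXT)["localdb"]   # loads() parses a str
print(type(cfg["port"]).__name__)        # numbers keep their type, unlike ini

# dumps() goes the other way and serialises the dict back to a json string.
text = json.dumps(cfg, indent=2, sort_keys=True)
print(text)
```

Unlike configparser, the json library preserves the distinction between numbers and strings, so no extra conversion is needed for the port.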

The drawback of json for configuration is its strict syntax. A common criticism is that comments are not allowed, unless you switch to a superset of json (the json settings files in VSCode, which do accept comments, are one such alternative). Deep nesting is another problem and easily leads to mistakes, so json is not well suited to long or complex configuration.
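
The restriction on comments is easy to observe with the standard json library. In this small sketch the document text is made up for illustration; it contains a // comment of the kind VSCode settings files accept, and the strict parser rejects it.

```python
import json

# A json document with a // comment (hypothetical content for illustration).
TEXT_WITH_COMMENT = """
{
    // database host
    "host": "127.0.0.1"
}
"""

try:
    json.loads(TEXT_WITH_COMMENT)
    parsed_ok = True
except json.JSONDecodeError as exc:
    # The standard parser follows strict json and refuses the comment.
    parsed_ok = False
    print(f"rejected: {exc.msg} (line {exc.lineno})")
```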

toml

The toml format is a configuration file format proposed by GitHub co-founder Tom Preston-Werner. According to Wikipedia, toml was first proposed in 2013. In some respects it resembles yaml, which is discussed later, but once you learn that the yaml specification runs to dozens of pages (yes, really dozens of pages...), you may not be keen to write such a complicated configuration file, and toml becomes a good choice.

The format of toml is roughly as follows:

[Figure 01: toml style]

As this shows, toml looks somewhat like the ini format described earlier, but it extends it considerably.

The sample image shows that, beyond basic strings, toml also supports timestamps, Boolean values, arrays and more, in a style very close to Python's native syntax.
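
To make these value types concrete, here is a small sketch that parses a made-up toml snippet and prints the Python type of each value. It is not the content of the figure: all keys and values are hypothetical. It uses tomllib, which is part of the standard library from Python 3.11, and falls back to the third-party tomli package on older interpreters.

```python
try:
    import tomllib                    # standard library, Python 3.11+
except ModuleNotFoundError:           # older interpreters: pip install tomli
    import tomli as tomllib

# Hypothetical sample; the keys are made up for illustration.
TOML_TEXT = """
title = "demo"
debug = true
created = 2013-07-01T08:30:00Z
ports = [8000, 8001, 8002]

[owner]
name = "Tom"
"""

cfg = tomllib.loads(TOML_TEXT)

# Each toml type maps to a native Python type:
# string -> str, boolean -> bool, timestamp -> datetime, array -> list, table -> dict
for key, value in cfg.items():
    print(f"{key:8} {type(value).__name__:10} {value!r}")
```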

I won't go through the toml specification in detail here. The official specification has already been translated, and interested readers can consult it directly.

Developers have already built Python libraries ("wheels") for this configuration file format. At present uiri/toml has the most stars on GitHub. It only passes the v0.5 toml specification, but it is quite simple to use. We can install it with pip:

pip install toml

Parsing with this library is simple and resembles the json library: use load() or loads() to parse, and dump() or dumps() to convert and export.

For example, we now write the following content into config.toml:

[mysql]
host = "127.0.0.1"
user = "root"
port = 3306
database = "test"
 [mysql.parameters]
 pool_size = 5
 charset = "utf8"
 [mysql.fields]
 pandas_cols = [ "id", "name", "age", "date"]

Then we can read it with the load() method of the toml library:

>>> import toml
>>> import os
>>> from pprint import pprint
>>> cfg = toml.load(os.path.expanduser("~/Desktop/config.toml"))
>>> pprint(cfg)
{'mysql': {'database': 'test',
           'fields': {'pandas_cols': ['id', 'name', 'age', 'date']},
           'host': '127.0.0.1',
           'parameters': {'charset': 'utf8', 'pool_size': 5},
           'port': 3306,
           'user': 'root'}}

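As you can see, the toml file has been converted into a dictionary, which is essentially the json form of the same data (just replace the single quotes with double quotes). This makes it convenient to look up values or pass them as arguments later.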
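
As a minimal sketch of both points, the snippet below writes out the parsed dictionary literally (so that it runs without the toml library or the file), reaches into the nested tables by chained keys, and produces the json form with json.dumps().

```python
import json

# The dictionary printed above, written out literally so the sketch runs on its own.
cfg = {
    "mysql": {
        "host": "127.0.0.1",
        "user": "root",
        "port": 3306,
        "database": "test",
        "parameters": {"pool_size": 5, "charset": "utf8"},
        "fields": {"pandas_cols": ["id", "name", "age", "date"]},
    }
}

# Nested tables become nested dictionaries, so values are reached by chained keys.
pool_size = cfg["mysql"]["parameters"]["pool_size"]
columns = cfg["mysql"]["fields"]["pandas_cols"]
print(pool_size, columns)

# json.dumps() yields the double-quoted json form of the same data.
as_json = json.dumps(cfg, indent=2)
print(as_json)
```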

yaml

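The yaml format (or yml format) is a popular configuration file format, proposed as early as 2001 by Clark Evans. It is also widely used today; a typical example is the docker-compose.yml file used with Docker containers, which will be familiar to anyone who regularly deploys with Docker.

The design of yaml draws inspiration from Python, XML and other sources, and their influence is clearly visible when you use it.

As I mentioned in the toml section, the yaml specification is long and complex, running to a full 80 pages (truly terrifying...).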

[Figure 02: yaml specification page count]

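Interested readers can therefore look into the details on their own.

The YAML project has long provided a corresponding Python library, PyYAML. Of course, we need to install it first: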

pip install pyyaml

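As with the json and toml libraries, content is loaded through the load() method.

Note that the load() method carries a security risk. As this report from Cisco Talos shows, loading an unknown or untrusted yaml file can expose you to attack, because the file can directly call Python functions to execute whatever commands the attacker wants. For example, a yaml file could contain the following: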

# Linux and macOS users: do not try this casually
!!python/object/apply:os.system ["rm -rf /"]

It is therefore best to use safe_load() instead of load().
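
The effect of safe_load() can be checked without any danger. The sketch below assumes PyYAML is installed and uses a harmless stand-in for the payload above (it would only ask for the current working directory); safe_load() refuses the Python-specific tag instead of calling the function, while ordinary data still loads normally.

```python
import yaml  # PyYAML, installed with: pip install pyyaml

# Harmless stand-in for the payload above: it would only call os.getcwd().
UNTRUSTED = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(UNTRUSTED)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load() only constructs plain YAML types, so the python/object tag is refused.
    rejected = True
    print(type(exc).__name__)

# Ordinary data still loads as usual.
data = yaml.safe_load("port: 3306\nhosts: [a, b]")
print(data)
```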

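The load() and safe_load() pair may remind you of the substitute() and safe_substitute() methods of the Template class in Python's built-in string standard library. The naming is similar, although there "safe" only means that a missing placeholder is left in place instead of raising a KeyError, not that a security hole is closed.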
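
For reference, the difference between the two Template methods concerns missing placeholders. This is a small sketch with a made-up connection string; the port value is deliberately left out of the mapping.

```python
from string import Template

# Hypothetical template; "port" is deliberately missing from the values.
tpl = Template("mysql://$user@$host:$port")
values = {"user": "root", "host": "127.0.0.1"}

try:
    tpl.substitute(values)
    raised = False
except KeyError as exc:
    # substitute() insists on a value for every placeholder.
    raised = True
    print("substitute() raised KeyError for", exc)

# safe_substitute() leaves unknown placeholders untouched instead of raising.
partial = tpl.safe_substitute(values)
print(partial)
```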

For example, we now write some of the earlier configuration into a config.yaml file:

mysql:
  host: "127.0.0.1"
  port: 3306
  user: "root"
  password: "123456"
  database: "test"
  parameter:
    pool_size: 5
    charset: "utf8"
  fields:
    pandas_cols:
      - id
      - name
      - age
      - date

Then we parse it with the safe_load() method:

>>> import os
>>> import yaml
>>> from pprint import pprint
>>>
>>> with open(os.path.expanduser("~/config.yaml"), "r") as config:
...     cfg = yaml.safe_load(config)
...
>>> pprint(cfg)
{'mysql': {'database': 'test',
           'fields': {'pandas_cols': ['id', 'name', 'age', 'date']},
           'host': '127.0.0.1',
           'parameter': {'charset': 'utf8', 'pool_size': 5},
           'password': '123456',
           'port': 3306,
           'user': 'root'}}

As you can see, the final result is basically the same as the toml parse result above.

Conclusion

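This article has listed some mainstream, common configuration file types and how to read them in Python. Some readers may notice that xml is missing. Those who work with Java-family languages probably encounter xml configuration files more often, but the readability of xml is frankly daunting. If you are unfamiliar with xml, open any website in Chrome and press F12 to open the developer tools: the dense mass of html elements you see there gives a fair impression of what xml looks like.

Besides these mainstream types, formats such as .cfg and .properties can also serve as configuration files. As mentioned at the start, you can even write your configuration in a standalone .py file and import it. That works fine, though it may get in the way when sharing across languages. This article therefore does not cover them further, and interested readers can explore them on their own.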

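The configuration file types listed in this article increase in complexity from top to bottom, that is, in the order they were presented, starting with ini.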

Statement:
This article is reproduced from 51cto.com.