Python Libraries: A Comprehensive Guide to Writing, Packaging, and Distributing
Python is a great programming language, but packaging is one of its weakest points. This is a well-known fact in the community. The process of installing, importing, using, and creating packages has improved a lot over the years, but it still doesn't compare to newer languages like Go and Rust, which have learned a lot from the struggles of Python and other mature languages.
In this tutorial, you'll learn everything you need to know about writing, packaging, and distributing your own packages.
A Python library is a coherent collection of Python modules organized into Python packages. Generally speaking, this means that all modules are in the same directory and that directory is on the Python search path.
Let's quickly write a small Python 3 package and illustrate all these concepts.
Python 3 has an excellent Path object, which is a huge improvement over Python 2's clumsy os.path module. But it's missing one key feature: finding the path of the current script. This is important when you want to locate and access files relative to the current script.
In many cases, the script can be installed anywhere, so absolute paths cannot be used, and the working directory can be set to any value, so relative paths cannot be used. If you want to access files in a subdirectory or parent directory, you must be able to find out the current script directory.
Here's how to do it in Python:
    import pathlib

    script_dir = pathlib.Path(__file__).parent.resolve()
To access a file named "file.txt" in the "data" subdirectory of the current script directory, you can use the following code:

    print(open(str(script_dir/'data/file.txt')).read())
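The same idea can be wrapped in a small helper. This is only a sketch: the read_relative name and its parameters are assumptions made for illustration and are not part of pathlib or the pathology package.

```python
import pathlib


def read_relative(script_file, *parts, encoding="utf-8"):
    """Read a text file located relative to the directory of script_file.

    read_relative is a hypothetical helper; pass __file__ as script_file
    when calling it from your own script.
    """
    script_dir = pathlib.Path(script_file).resolve().parent
    # joinpath accepts several components, so no manual string joining is needed
    return script_dir.joinpath(*parts).read_text(encoding=encoding)
```

A script would call it as read_relative(__file__, 'data', 'file.txt'). Using read_text also closes the file for you, which the one-line open(...).read() form does not guarantee.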
With the pathology package, you get a built-in script_dir() method, which you can use like this:

    from pathology.path import Path

    print(open(str(Path.script_dir()/'data/file.txt')).read())
Yes, it's a mouthful. The pathology package is very simple. It derives its own Path class from pathlib's Path and adds a static script_dir() method that always returns the path of the calling script.
Here is the implementation:
    import inspect
    import pathlib


    class Path(type(pathlib.Path())):
        @staticmethod
        def script_dir():
            # stack()[1] is the frame of whoever called script_dir()
            caller_file = inspect.stack()[1].filename
            return pathlib.Path(caller_file).parent.resolve()
Due to the cross-platform implementation of pathlib.Path, you can't derive from it directly (at least in Python versions before 3.12) and must derive from a specific subclass (PosixPath or WindowsPath), which is why the class above derives from type(pathlib.Path()). The script_dir resolution uses the inspect module to find the caller and then its filename attribute.
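The caller lookup can be tried in isolation. The following is a minimal sketch of the same technique as a free function; the caller_dir name is an assumption made for illustration and is not part of the pathology package.

```python
import inspect
import pathlib


def caller_dir():
    """Return the resolved directory of the file that called this function.

    caller_dir is a hypothetical free-function version of script_dir().
    """
    # stack()[0] is this frame; stack()[1] is whoever called caller_dir()
    caller_frame = inspect.stack()[1]
    return pathlib.Path(caller_frame.filename).parent.resolve()
```

Note that the index is relative to the direct caller: if you wrap caller_dir() in another helper, stack()[1] points at that helper's file rather than at the script that started the call chain.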
Whenever you write something more than a one-off script, you should test it. The pathology module is no exception. Here are the tests using the standard unit testing framework:
    import os
    import shutil
    from unittest import TestCase
    from pathology.path import Path


    class PathTest(TestCase):
        def test_script_dir(self):
            expected = os.path.abspath(os.path.dirname(__file__))
            actual = str(Path.script_dir())
            self.assertEqual(expected, actual)

        def test_file_access(self):
            script_dir = os.path.abspath(os.path.dirname(__file__))
            subdir = os.path.join(script_dir, 'test_data')
            if Path(subdir).is_dir():
                shutil.rmtree(subdir)
            os.makedirs(subdir)
            file_path = str(Path(subdir)/'file.txt')
            content = '123'
            open(file_path, 'w').write(content)
            test_path = Path.script_dir()/subdir/'file.txt'
            actual = open(str(test_path)).read()
            self.assertEqual(content, actual)
Python packages must be installed somewhere on the Python search path to be imported by Python modules. The Python search path is a list of directories and is always available in sys.path. This is my current sys.path:
    >>> print('\n'.join(sys.path))

    /Users/gigi.sayfan/miniconda3/envs/py3/lib/python36.zip
    /Users/gigi.sayfan/miniconda3/envs/py3/lib/python3.6
    /Users/gigi.sayfan/miniconda3/envs/py3/lib/python3.6/lib-dynload
    /Users/gigi.sayfan/miniconda3/envs/py3/lib/python3.6/site-packages
    /Users/gigi.sayfan/miniconda3/envs/py3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg
Note that the first empty line of the output represents the current directory, so you can import modules from the current working directory, whatever it is. You can also add directories to sys.path or remove them from it directly.
You can also define a PYTHONPATH environment variable, and there are some other ways to control the search path. The standard site-packages directory is included by default, and this is where packages you install via pip go.
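To see how the search path drives imports, here is a small sketch that writes a throwaway module into a temporary directory, adds that directory to sys.path, and imports the module. The import_from_directory helper and the demo_mod module are hypothetical names used only for this demonstration.

```python
import importlib
import pathlib
import sys
import tempfile


def import_from_directory(directory, module_name):
    """Import module_name after putting directory at the front of sys.path.

    import_from_directory is a hypothetical helper for illustration.
    """
    directory = str(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()  # make sure newly created files are seen
    return importlib.import_module(module_name)


with tempfile.TemporaryDirectory() as tmp:
    # demo_mod is a throwaway module written only for this demonstration
    (pathlib.Path(tmp) / "demo_mod.py").write_text("ANSWER = 42\n")
    demo = import_from_directory(tmp, "demo_mod")
    print(demo.ANSWER)
    sys.path.remove(tmp)  # undo the change so later imports are unaffected
```

Editing sys.path at runtime is handy for experiments like this one, but for real projects installing the package into site-packages is usually the more robust choice.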
Now that we have the code and tests, let's package it all into a proper library. Python provides an easy way to do this through the setuptools module. You create a file named setup.py in the root directory of the package.
The setup.py file contains a lot of metadata, such as the author, license, maintainers, and other information about the package. This is in addition to the packages item, which uses the find_packages() function imported from setuptools to find subpackages.
This is the setup.py file of the pathology package:
    from setuptools import setup, find_packages

    setup(name='pathology',
          version='0.1',
          url='https://github.com/the-gigi/pathology',
          license='MIT',
          author='Gigi Sayfan',
          author_email='the.gigi@gmail.com',
          description='Add static script_dir() method to Path',
          packages=find_packages(exclude=['tests']),
          long_description=open('README.md').read(),
          zip_safe=False)
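To build some intuition for what the packages entry ends up containing, the sketch below walks a directory tree and collects every directory that has an __init__.py file. It is a simplified stand-in written with the standard library only, not the setuptools implementation; the find_package_dirs name is an assumption, and unlike the real find_packages() it matches excluded names exactly instead of supporting wildcard patterns.

```python
import os


def find_package_dirs(root, exclude=()):
    """Return dotted names of directories under root that contain __init__.py.

    A simplified, hypothetical stand-in for setuptools.find_packages():
    exclude only matches exact package names here.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue  # the project root itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # not a package, so do not descend further
            continue
        name = os.path.relpath(dirpath, root).replace(os.sep, ".")
        if name in exclude:
            continue
        found.append(name)
    return sorted(found)
```

For a project laid out like pathology, with a pathology directory and a tests directory that both contain __init__.py, calling find_package_dirs('.', exclude=['tests']) would be expected to return only the pathology package.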
A source distribution package is an archive file containing the Python packages, modules, and other files that make up a package release (for example, version 1, version 2, and so on). Once the file is distributed, end users can download it and install it on their operating systems.
To create a source distribution package (sdist), run:

    python setup.py sdist
Let’s build a source code distribution:
    $ python setup.py sdist
    running sdist
    running egg_info
    creating pathology.egg-info
    writing pathology.egg-info/PKG-INFO
    writing dependency_links to pathology.egg-info/dependency_links.txt
    writing top-level names to pathology.egg-info/top_level.txt
    writing manifest file 'pathology.egg-info/SOURCES.txt'
    reading manifest file 'pathology.egg-info/SOURCES.txt'
    writing manifest file 'pathology.egg-info/SOURCES.txt'
    warning: sdist: standard file not found: should have one of README, README.rst, README.txt

    running check
    creating pathology-0.1
    creating pathology-0.1/pathology
    creating pathology-0.1/pathology.egg-info
    copying files to pathology-0.1...
    copying setup.py -> pathology-0.1
    copying pathology/__init__.py -> pathology-0.1/pathology
    copying pathology/path.py -> pathology-0.1/pathology
    copying pathology.egg-info/PKG-INFO -> pathology-0.1/pathology.egg-info
    copying pathology.egg-info/SOURCES.txt -> pathology-0.1/pathology.egg-info
    copying pathology.egg-info/dependency_links.txt -> pathology-0.1/pathology.egg-info
    copying pathology.egg-info/not-zip-safe -> pathology-0.1/pathology.egg-info
    copying pathology.egg-info/top_level.txt -> pathology-0.1/pathology.egg-info
    Writing pathology-0.1/setup.cfg
    creating dist
    Creating tar archive
    removing 'pathology-0.1' (and everything under it)
This warning is because I used a non-standard README.md file. It's safe to ignore it. The above command will create an archive file in the default format for the current operating system. For Unix systems, a gzipped tar file will be generated in the dist directory:
    $ ls -la dist
    total 8
    drwxr-xr-x   3 gigi.sayfan  gigi.sayfan   102 Apr 18 21:20 .
    drwxr-xr-x  12 gigi.sayfan  gigi.sayfan   408 Apr 18 21:20 ..
    -rw-r--r--   1 gigi.sayfan  gigi.sayfan  1223 Apr 18 21:20 pathology-0.1.tar.gz
If you are using Windows, a zip file will be generated.
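Before distributing an archive, it can be worth checking what actually went into it. The sketch below lists the members of a gzipped tar file with the standard tarfile module; the list_sdist name is an assumption made for illustration.

```python
import tarfile


def list_sdist(archive_path):
    """Return the sorted member names of a .tar.gz source distribution.

    list_sdist is a hypothetical helper for illustration.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())
```

Calling list_sdist('dist/pathology-0.1.tar.gz') on the archive built above should list the files that the build log reports as copied into pathology-0.1.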
You can also request additional archive formats with the --formats option, as shown below.

    python setup.py sdist --formats=gztar,zip
For example, the above command will generate a gzipped tarball and a zip file.
The different formats available are:
zip: .zip
gztar: .tar.gz
bztar: .tar.bz2
xztar: .tar.xz
ztar: .tar.Z
tar: .tar
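The format-to-extension mapping can be observed with the standard library's shutil.make_archive, which uses the same format names for the ones it supports (ztar is not among them). The sketch below is only an illustration of the naming; it is not how setuptools builds an sdist, and the archive_names helper is a hypothetical name.

```python
import os
import shutil

# Formats that shutil can write on this interpreter (this depends on which
# compression modules are available); "ztar" is not among them.
available = {name for name, _description in shutil.get_archive_formats()}


def archive_names(source_dir, base_name, formats=("zip", "gztar", "tar")):
    """Archive source_dir once per format and return the file names produced.

    archive_names is a hypothetical helper for illustration.
    """
    names = {}
    for fmt in formats:
        if fmt in available:
            path = shutil.make_archive(base_name, fmt, root_dir=source_dir)
            names[fmt] = os.path.basename(path)
    return names
```

Pass a base_name that lives outside source_dir, otherwise later archives would include the earlier ones.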
To create a binary distribution called a wheel, run:

    python setup.py bdist_wheel

Here is a binary distribution:
    $ python setup.py bdist_wheel
    running bdist_wheel
    running build
    running build_py
    creating build
    creating build/lib
    creating build/lib/pathology
    copying pathology/__init__.py -> build/lib/pathology
    copying pathology/path.py -> build/lib/pathology
    installing to build/bdist.macosx-10.7-x86_64/wheel
    running install
    running install_lib
    creating build/bdist.macosx-10.7-x86_64
    creating build/bdist.macosx-10.7-x86_64/wheel
    creating build/bdist.macosx-10.7-x86_64/wheel/pathology
    copying build/lib/pathology/__init__.py -> build/bdist.macosx-10.7-x86_64/wheel/pathology
    copying build/lib/pathology/path.py -> build/bdist.macosx-10.7-x86_64/wheel/pathology
    running install_egg_info
    running egg_info
    writing pathology.egg-info/PKG-INFO
    writing dependency_links to pathology.egg-info/dependency_links.txt
    writing top-level names to pathology.egg-info/top_level.txt
    reading manifest file 'pathology.egg-info/SOURCES.txt'
    writing manifest file 'pathology.egg-info/SOURCES.txt'
    Copying pathology.egg-info to build/bdist.macosx-10.7-x86_64/wheel/pathology-0.1-py3.6.egg-info
    running install_scripts
    creating build/bdist.macosx-10.7-x86_64/wheel/pathology-0.1.dist-info/WHEEL
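The pathology package contains only pure Python modules, so a universal package can be built. If your package includes C extensions, you'll have to build a separate wheel for each platform: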
    $ ls -la dist
    total 16
    drwxr-xr-x   4 gigi.sayfan  gigi.sayfan   136 Apr 18 21:24 .
    drwxr-xr-x  13 gigi.sayfan  gigi.sayfan   442 Apr 18 21:24 ..
    -rw-r--r--   1 gigi.sayfan  gigi.sayfan  2695 Apr 18 21:24 pathology-0.1-py3-none-any.whl
    -rw-r--r--   1 gigi.sayfan  gigi.sayfan  1223 Apr 18 21:20 pathology-0.1.tar.gz
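The wheel's file name encodes its compatibility: in pathology-0.1-py3-none-any.whl, py3 is the Python tag, none is the ABI tag, and any is the platform tag, which is what marks it as a pure Python wheel. The sketch below splits a name into these parts. The parse_wheel_name helper and the WheelName type are hypothetical names, and the code assumes the optional build tag is absent.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename):
    """Split a wheel file name into its tags.

    parse_wheel_name is a hypothetical helper; it assumes there is no
    optional build tag, which would add a sixth component.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("expected 5 dash-separated parts: %r" % filename)
    return WheelName(*parts)


print(parse_wheel_name("pathology-0.1-py3-none-any.whl"))
```

A wheel with C extensions would carry a concrete ABI tag and platform tag instead of none and any, which is why one wheel per platform is needed in that case.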
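For a deeper dive into the topic of packaging Python libraries, check out How to Write Your Own Python Packages.
Python has a central package repository called PyPI (Python Package Index). PyPI makes it easy to manage different versions of packages. For example, if a user needs to install a specific package version, pip knows where to look for it.
When you install a Python package using pip, it downloads the package from PyPI (unless you specify a different repository). To distribute our pathology package, we need to upload it to PyPI and provide some extra metadata that PyPI requires. The steps are:
Make sure the latest version of pip is installed on your operating system. To upgrade pip, issue the following command: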
python3 -m pip install --upgrade pip
You can create an account on the PyPI website. Then create a .pypirc file in your home directory:
    [distutils]
    index-servers=pypi

    [pypi]
    repository = https://pypi.python.org/pypi
    username = the_gigi
For testing purposes, you can add a pypitest index server to your .pypirc file:
    [distutils]
    index-servers=
        pypi
        pypitest

    [pypitest]
    repository = https://testpypi.python.org/pypi
    username = the_gigi

    [pypi]
    repository = https://pypi.python.org/pypi
    username = the_gigi
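Since .pypirc uses INI syntax, you can sanity-check it with the standard configparser module. The sketch below parses sample text that mirrors the file above and maps each listed index server to its repository URL; the index_servers helper and the SAMPLE_PYPIRC constant are hypothetical names used for illustration.

```python
import configparser

# Sample text mirroring the .pypirc layout from the article
SAMPLE_PYPIRC = """\
[distutils]
index-servers=
    pypi
    pypitest

[pypitest]
repository = https://testpypi.python.org/pypi
username = the_gigi

[pypi]
repository = https://pypi.python.org/pypi
username = the_gigi
"""


def index_servers(pypirc_text):
    """Map each server named in index-servers to its repository URL.

    index_servers is a hypothetical helper for checking a .pypirc file.
    """
    parser = configparser.ConfigParser()
    parser.read_string(pypirc_text)
    # The multi-line value is whitespace separated, so split() is enough
    names = parser.get("distutils", "index-servers").split()
    return {name: parser.get(name, "repository") for name in names}


print(index_servers(SAMPLE_PYPIRC))
```

A server listed in index-servers without a matching section raises configparser.NoSectionError here, which makes that kind of typo easy to spot.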
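If this is the first release of your package, you need to register it with PyPI. Use the register command of setup.py. It will ask you for your password. Note that I point it to the test repository here: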
    $ python setup.py register -r pypitest
    running register
    running egg_info
    writing pathology.egg-info/PKG-INFO
    writing dependency_links to pathology.egg-info/dependency_links.txt
    writing top-level names to pathology.egg-info/top_level.txt
    reading manifest file 'pathology.egg-info/SOURCES.txt'
    writing manifest file 'pathology.egg-info/SOURCES.txt'
    running check
    Password:
    Registering pathology to https://testpypi.python.org/pypi
    Server response (200): OK
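Now that the package is registered, we can upload it. I recommend using twine, which is more secure. Install it as usual with pip install twine. Then upload your package using twine and provide your password (redacted below):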
    $ twine upload -r pypitest -p <redacted> dist/*
    Uploading distributions to https://testpypi.python.org/pypi
    Uploading pathology-0.1-py3-none-any.whl
    [================================] 5679/5679 - 00:00:02
    Uploading pathology-0.1.tar.gz
    [================================] 4185/4185 - 00:00:01
The package is now available on the official PyPI site.
To install it with pip, just issue the following command:
pip install pathology
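After installing, you may want to confirm from Python that the distribution is present. This is a minimal sketch using importlib.metadata, which assumes Python 3.8 or later (newer than the Python 3.6 environment shown earlier); the installed_version name is an assumption made for illustration.

```python
from importlib import metadata  # available in Python 3.8 and later


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed.

    installed_version is a hypothetical helper for illustration.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_version('pathology') after a successful install would be expected to return the version string from setup.py, and None if the install did not happen.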
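For a deeper dive into the topic of distributing your packages, check out How to Share Your Python Packages.
In this tutorial, we went through the complete process of writing a Python library, packaging it, and distributing it through PyPI. At this point, you should have all the tools you need to write your own libraries and share them with the world.
This post has been updated with contributions from Esther Vaati. Esther is a software developer and writer for Envato Tuts+.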