

How to improve Python packaging, or why fourteen tools are at least twelve too many
source link: https://chriswarrick.com/blog/2023/01/15/how-to-improve-python-packaging/

The plethora of tools
There are many packaging-related tools in Python. All of them have different authors and lineages, and often different opinions, although most of them are now unified under the Python Packaging Authority (PyPA) umbrella. Let’s take a look at them.
The classic stack
The classic Python packaging stack consists of many semi-related tools. Setuptools, probably the oldest tool of the group, and itself based on `distutils`, which is part of the standard library (although it will be removed in Python 3.12), is responsible for installing a single package. It previously used `setup.py` files to do its job, which required arbitrary code execution. It then added support for non-executable metadata specification formats: `setup.cfg`, and also `pyproject.toml` (support for the latter is still partially in beta). However, you aren’t supposed to use `setup.py` files directly these days; you’re supposed to be using pip. Pip installs packages, usually from PyPI, but it can also support other sources (such as git repositories or the local filesystem).

But where does pip install things? The default used to be to install globally and system-wide, which meant you could introduce conflicts between packages installed by pip and by apt (or whatever the system package manager is). Even with a user-wide install (which pip is likely to attempt these days), you can still end up with conflicts. You can also have conflicts in which package A requests X version 1.0.0 but package B expects X version 2.0.0, even though A and B are not related at all and could live separately, each with its preferred version of X. Enter `venv`, a standard-library descendant of `virtualenv`, which can create a lightweight virtual environment for packages to live in. This virtual environment gives you separation from system packages and from different environments, but it is still tied to the system Python in some ways (and if the system Python disappears, the virtual environment stops working).
A few extra tools would be used in a typical packaging workflow. The `wheel` package enhances Setuptools with the ability to generate wheels, which are ready to install (without running `setup.py`). Wheels can either be pure Python, installable anywhere, or they can contain pre-compiled extension modules (things written in C) for a given OS and Python version (and there’s even a standard that allows building and distributing one wheel for all typical Linux distros). The `wheel` package should be an implementation detail, something existing inside Setuptools and/or pip, but users need to be aware of it if they want to make wheels on their system, because virtual environments produced by `venv` do not have `wheel` installed. Regular users who do not maintain their own packages may sometimes be told that pip is using something legacy because `wheel` is not installed, which is not a good user experience. Package authors also need `twine`, whose sole task is uploading source distributions or wheels, created with other tools, to PyPI (and there’s not much more to say about that tool).
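For illustration, assuming the `build` package (a modern build frontend) and `twine` are installed, a release might look like the transcript below; the file names are hypothetical:

```
$ python -m build      # builds dist/example-1.0.0.tar.gz and dist/example-1.0.0-py3-none-any.whl
$ twine upload dist/*  # uploads both to PyPI, prompting for credentials
```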
…and a few extensions
Over the years, there have been a few tools that are based on things from the classic stack. For example, `pip-tools` can simplify dependency management. While `pip freeze` lets you produce a file listing everything installed in your environment, plain pip has gaps: there is no way to specify just the dependencies you need and get a lock file with specific versions and transitive dependencies (without installing and freezing everything); there is no easy way to skip development dependencies (e.g. IPython) when you `pip freeze`; and there is no workflow to update all your dependencies with just pip. `pip-tools` adds two tools: `pip-compile`, which takes in `requirements.in` files with the packages you care about and produces a `requirements.txt` with pinned versions of them and all transitive dependencies; and `pip-sync`, which can install `requirements.txt` and remove things not listed in it.
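As a sketch, a hypothetical `requirements.in` lists only the direct dependencies, loosely pinned:

```
# requirements.in - direct dependencies only
django>=4.1
requests
```

Running `pip-compile requirements.in` would then generate a `requirements.txt` that pins these two packages plus all of their transitive dependencies (such as `asgiref` and `sqlparse`, pulled in by Django), and `pip-sync requirements.txt` makes the current environment match that file exactly.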
Another tool that might come in useful is `virtualenvwrapper`, which can help you manage (create and activate) virtual environments in a central location. It has a few bells and whistles (such as custom hooks to run actions on every virtualenv creation), although for basic usage, you could replace it with a single-line shell function.
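For instance, a minimal stand-in could look like this (the `~/.venvs` location and the function name are arbitrary choices, not anything virtualenvwrapper mandates):

```shell
# Create-or-activate a named virtual environment kept under ~/.venvs.
workon() { [ -d ~/.venvs/"$1" ] || python3 -m venv ~/.venvs/"$1"; . ~/.venvs/"$1"/bin/activate; }
```

`workon myproject` creates `~/.venvs/myproject` on first use and activates it in the current shell.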
Yet another tool that works alongside the classic toolset is `pipx`, which creates and manages virtual environments for apps written in Python. You tell it to `pipx install Nikola`, and it will create a virtual environment somewhere, install Nikola into it, and put a script for launching it in `~/.local/bin`. While you could do it all yourself with `venv` and some symlinks, pipx can take care of this, and you don’t need to remember where the virtual environment is.
The scientific stack and conda
The scientific Python community has had its own tools for many years. The conda tool can manage environments and packages. It doesn’t use PyPI and wheels, but rather packages from conda channels (which are prebuilt, and expect an Anaconda-distributed Python). Back in the day, when there were no wheels, this was the easiest way to get things installed on Windows; this is less of a problem now that PyPI has binary wheels, but the Anaconda stack is still popular in the scientific world. Conda packages can be built with `conda-build`, which is separate from, but closely related to, `conda` itself. Conda packages are not compatible with pip in any way; they do not follow the packaging standards used by other tools. Is this good? No, because it makes integrating the two worlds harder; but also yes, because many problems that apply to scientific packages (and their C/C++ extension modules, their high-performance numeric libraries, and other things) do not apply to other uses of Python, so having a separate tool lets people focused on those other uses simplify their workflows.
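A typical conda session, with a hypothetical environment name, looks like this:

```
$ conda create -n sci python=3.11 numpy   # solve and install from conda channels
$ conda activate sci                      # switch to the new environment
```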
The new tools
A few years ago, new packaging tools appeared. Now, there were lots of “new fancy tools” introduced in the past, with setuptools extending distutils, then distribute forking setuptools, then distribute being merged back…
The earliest “new tool” was Pipenv. Pipenv had really terrible and misleading marketing, and it merged pip and venv, in that Pipenv would create a venv and install packages into it (from `Pipfile` or `Pipfile.lock`). Pipenv can place the venv in the project folder, or hide it away somewhere in your home directory (the latter is the default). However, Pipenv does not handle anything related to packaging your code, so it’s useful only for developing non-installable applications (Django sites, for example). If you’re a library developer, you need setuptools anyway.
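For illustration, a minimal `Pipfile` (the dependencies here are placeholders) looks roughly like this:

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
django = "*"

[dev-packages]
pytest = "*"
```

`pipenv install` resolves this into a `Pipfile.lock` with exact pinned versions and installs everything into the managed venv.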
The second new tool was Poetry. It manages environments and dependencies in a similar way to Pipenv, but it can also build `.whl` files with your code, and it can upload wheels and source distributions to PyPI. This means it has pretty much all the features the other tools have, except you need just one tool. However, Poetry is opinionated, and its opinions are sometimes incompatible with the rest of the packaging scene. Poetry uses the `pyproject.toml` standard, but it does not follow the standard specifying how metadata should be represented in a `pyproject.toml` file (PEP 621), instead using a custom `[tool.poetry]` table. This is partly because Poetry came out before PEP 621, but that PEP was accepted over two years ago; the biggest compatibility problem is Poetry’s Node-inspired `~` and `^` dependency version markers, which are not compatible with PEP 508 (the dependency specification standard). Poetry can package C extension modules, although it uses Setuptools’ infrastructure for this (and requires a custom `build.py` script).
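A sketch of Poetry’s custom table, for a made-up project, shows both the `[tool.poetry]` layout and the non-standard caret markers:

```toml
[tool.poetry]
name = "example-app"
version = "0.1.0"
description = "A hypothetical application"
authors = ["Jane Doe <jane@example.com>"]

[tool.poetry.dependencies]
python = "^3.10"
requests = "^2.28"   # caret means >=2.28,<3.0; PEP 508 has no such operator
```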
Another similar tool is Hatch. This tool can also manage environments (it allows multiple environments per project, but it does not allow putting them in the project directory), and it can manage packages (but without lockfile support). Hatch can also be used to package a project (with PEP 621-compliant `pyproject.toml` files) and upload it to PyPI. It does not support C extension modules.
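For illustration, a few Hatch commands (the project name is made up):

```
$ hatch new example-app   # scaffold a project with a PEP 621 pyproject.toml
$ hatch shell             # spawn a shell inside a managed environment
$ hatch build             # produce an sdist and a wheel under dist/
```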
A tool that tries to be a simpler reimagining of Setuptools is Flit. It can build and install a package using a `pyproject.toml` file. It also supports uploads to PyPI. It lacks support for C extension modules, and it expects you to manage environments on your own.
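A minimal, standards-compliant `pyproject.toml` for Flit (the project metadata here is hypothetical) could look like:

```toml
[build-system]
requires = ["flit_core >=3.2,<4"]
build-backend = "flit_core.buildapi"

[project]
name = "example_pkg"
version = "0.1.0"
description = "A hypothetical pure-Python package"
dependencies = ["requests >=2.28"]
```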
There’s one more interesting (albeit not popular or well-known) tool. This tool is PDM. It can manage venvs (but it defaults to the saner .venv
location), manage dependencies, and it uses a standards-compliant pyproject.toml
. There’s also a curious little feature called PEP 582 support, which we’ll talk about later.