Notes on packaging XGBoost’s Python package

How to build binary wheels and source distributions

Wheels and source distributions (sdist for short) are the two main mechanisms for packaging and distributing Python packages.

  • A source distribution (sdist) is a tarball (.tar.gz extension) that contains the source code.

  • A wheel is a ZIP-compressed archive (with .whl extension) representing a built distribution. Unlike an sdist, a wheel can contain compiled components. The compiled components are compiled prior to distribution, making it more convenient for end-users to install a wheel. Wheels containing compiled components are referred to as binary wheels.

See Python Packaging User Guide to learn more about how Python packages in general are packaged and distributed.

For the remainder of this document, we will focus on packaging and distributing XGBoost.

Building sdists

In the case of XGBoost, an sdist contains both the Python code as well as the C++ code, so that the core part of XGBoost can be compiled into the shared libary [1].

You can obtain an sdist as follows:

$ python -m build --sdist .

(You’ll need to install the build package first: pip install build or conda install python-build.)

Running pip install with an sdist will launch CMake and a C++ compiler to compile the bundled C++ code into

$ pip install -v xgboost-2.0.0.tar.gz  # Add -v to show build progress

Building binary wheels

You can also build a wheel as follows:

$ pip wheel --no-deps -v .

Notably, the resulting wheel contains a copy of the shared library [1]. The wheel is a binary wheel, since it contains a compiled binary.

Running pip install with the binary wheel will extract the content of the wheel into the current Python environment. Since the wheel already contains a pre-built copy of, it does not have to be built at the time of install. So pip install with the binary wheel completes quickly:

$ pip install xgboost-2.0.0-py3-none-linux_x86_64.whl  # Completes quickly