Notes on packaging XGBoost’s Python package
Wheels and source distributions (sdist for short) are the two main mechanisms for packaging and distributing Python packages.
A source distribution (sdist) is a tarball (.tar.gz extension) that contains the source code.
A wheel is a ZIP-compressed archive (with .whl extension) representing a built distribution. Unlike an sdist, a wheel can contain compiled components. These components are compiled before distribution, which makes a wheel more convenient for end users to install. Wheels containing compiled components are referred to as binary wheels.
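Because the two formats use different containers, a file can be classified by inspecting it rather than by trusting its extension. The following is a minimal sketch using only the Python standard library; the `distribution_kind` and `make_demo_archives` helpers and the `demo` package are hypothetical names introduced for illustration, and the stand-in archives are not real distributions.

```python
import io
import os
import tarfile
import tempfile
import zipfile


def distribution_kind(path):
    """Guess whether a file is a wheel (ZIP archive) or an sdist (tarball)."""
    if zipfile.is_zipfile(path):
        return "wheel"
    if tarfile.is_tarfile(path):
        return "sdist"
    return "unknown"


def make_demo_archives(directory):
    """Create a tiny stand-in wheel and sdist for a hypothetical 'demo' package."""
    wheel_path = os.path.join(directory, "demo-0.1-py3-none-any.whl")
    with zipfile.ZipFile(wheel_path, "w") as zf:
        zf.writestr("demo/__init__.py", "")

    sdist_path = os.path.join(directory, "demo-0.1.tar.gz")
    payload = b"# placeholder source file\n"
    with tarfile.open(sdist_path, "w:gz") as tf:
        info = tarfile.TarInfo("demo-0.1/demo/__init__.py")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return wheel_path, sdist_path


with tempfile.TemporaryDirectory() as tmp:
    wheel_path, sdist_path = make_demo_archives(tmp)
    kinds = (distribution_kind(wheel_path), distribution_kind(sdist_path))
    print(kinds)
```

The check only looks at the container format, so any ZIP file would be reported as a wheel; a stricter check would also look for the metadata a real wheel carries.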
See the Python Packaging User Guide to learn more about how Python packages in general are packaged and distributed.
For the remainder of this document, we will focus on packaging and distributing XGBoost.
In the case of XGBoost, an sdist contains both the Python code and
the C++ code, so that the core part of XGBoost can be compiled into the
shared library libxgboost.so.
You can obtain an sdist as follows:
$ python -m build --sdist .
(You’ll need to install the build package first: pip install build or conda install python-build.)
Running pip install with an sdist will launch CMake and a C++ compiler
to compile the bundled C++ code into libxgboost.so:
$ pip install -v xgboost-2.0.0.tar.gz # Add -v to show build progress
You can also build a wheel as follows:
$ pip wheel --no-deps -v .
Notably, the resulting wheel contains a copy of the shared library
libxgboost.so. The wheel is a binary wheel,
since it contains a compiled binary.
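One way to confirm that a wheel is a binary wheel is to list its members and look for compiled libraries. The sketch below does this with the standard `zipfile` module. The `compiled_members` helper, the suffix list, and the member paths inside the stand-in archive are assumptions made for illustration; the layout of a real XGBoost wheel may differ.

```python
import io
import zipfile

# Suffixes commonly used for compiled libraries on Linux, macOS, and Windows.
COMPILED_SUFFIXES = (".so", ".dylib", ".dll", ".pyd")


def compiled_members(wheel_file):
    """Return the archive members whose names look like compiled binaries."""
    with zipfile.ZipFile(wheel_file) as zf:
        return [name for name in zf.namelist() if name.endswith(COMPILED_SUFFIXES)]


# Build a stand-in wheel in memory; the member paths below are hypothetical.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("xgboost/__init__.py", "")
    zf.writestr("xgboost/lib/libxgboost.so", b"\x7fELF")
buf.seek(0)

found = compiled_members(buf)
print(found)  # ['xgboost/lib/libxgboost.so']
```

To inspect a wheel on disk, pass its path to `compiled_members` in place of the in-memory buffer. An empty result suggests a pure-Python wheel.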
Running pip install with the binary wheel will extract the content of
the wheel into the current Python environment. Since the wheel already
contains a pre-built copy of libxgboost.so, it does not have to be
built at the time of install. So pip install with the binary wheel
completes quickly:
$ pip install xgboost-2.0.0-py3-none-linux_x86_64.whl # Completes quickly
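The wheel filename itself encodes compatibility information: the dash-separated fields are the distribution name, the version, an optional build tag, and then the Python, ABI, and platform tags. The following minimal sketch splits a filename into those fields. `parse_wheel_filename` is a hypothetical helper written for illustration, not a pip or XGBoost API, and it does not validate the individual tags.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    # Five fields without a build tag, six with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


tags = parse_wheel_filename("xgboost-2.0.0-py3-none-linux_x86_64.whl")
for key, value in tags.items():
    print(f"{key}: {value}")
```

For the wheel above, the platform tag is linux_x86_64, while the py3 and none tags indicate that the wheel is not tied to a particular Python 3 minor version or ABI.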