XGBoost C Package
XGBoost implements a set of C APIs designed for various bindings; we maintain their stability and the CMake/make build interface. See C API Tutorial for an introduction and demo/c-api/ for related examples. You can also generate the doxygen documentation by passing -DBUILD_C_DOC=ON as a parameter to CMake during the build, or simply read the function comments in include/xgboost/c_api.h. The reference is exported to sphinx with the help of breathe, which doesn’t contain links to examples but might be easier to read. For the original doxygen pages please visit:
C API Reference
Library
- group Library
These functions are used to obtain general information about XGBoost including version, build info and current global configuration.
Typedefs
-
typedef void *DMatrixHandle
handle to DMatrix
-
typedef void *BoosterHandle
handle to Booster
Functions
-
void XGBoostVersion(int *major, int *minor, int *patch)
Return the version of the XGBoost library being currently used.
The output variable is only written if it’s not NULL.
- Parameters:
major – Store the major version number
minor – Store the minor version number
patch – Store the patch (revision) number
-
int XGBuildInfo(char const **out)
Get compile information of shared library.
- Parameters:
out – string encoded JSON object containing build flags and dependency version.
- Returns:
0 for success, -1 for failure
-
const char *XGBGetLastError()
Get the string message of the last error.
All functions in this file return 0 on success and -1 when an error occurs; XGBGetLastError can be called to retrieve the error.
This function is thread safe and can be called from different threads.
- Returns:
const char* error information
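The return-code convention described above lends itself to a small checking helper. The following is a minimal sketch and is not taken from the XGBoost sources: stub_call and stub_last_error are hypothetical stand-ins for a C API function and for XGBGetLastError, so that the snippet builds without the library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins so the sketch builds without libxgboost. In real
 * code, stub_last_error would be XGBGetLastError and stub_call any C API
 * function that follows the 0 / -1 return convention. */
static const char *g_stub_error = "";

const char *stub_last_error(void) { return g_stub_error; }

int stub_call(int should_fail) {
  if (should_fail) {
    g_stub_error = "stub failure";
    return -1;
  }
  return 0;
}

/* Print the error message of a failed call and pass the return code on. */
int check_call(int rc, const char *what) {
  assert(what != NULL);
  if (rc != 0) {
    fprintf(stderr, "%s failed: %s\n", what, stub_last_error());
  }
  return rc;
}
```

Wrapping every call this way keeps the error message close to the call that produced it, which matters because the message is only meaningful until the next failing call.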
-
int XGBRegisterLogCallback(void (*callback)(const char*))
Register a callback function for LOG(INFO) messages, which are helpful messages that are not errors. Note: this function can be called by multiple threads. The callback function will run on the thread that registered it.
- Returns:
0 for success, -1 for failure
-
int XGBSetGlobalConfig(char const *config)
Set global configuration (collection of parameters that apply globally). This function accepts the list of key-value pairs representing the global-scope parameters to be configured. The list of key-value pairs is passed in as a JSON string.
- Parameters:
config – a JSON string representing the list of key-value pairs. The JSON object shall be flat: no value can be a JSON object or an array.
- Returns:
0 for success, -1 for failure
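As an illustration of the flat JSON object expected by config, the sketch below formats a single key-value pair into a caller-provided buffer. The helper name format_global_config is hypothetical, and verbosity is used only as an example key; consult the global configuration documentation for the keys accepted by the version in use.

```c
#include <assert.h>
#include <stdio.h>

/* Write {"verbosity": <level>} into buf. Returns 0 on success and -1 if the
 * buffer is too small, mirroring the return convention of the C API. */
int format_global_config(char *buf, size_t size, int verbosity) {
  assert(buf != NULL);
  int n = snprintf(buf, size, "{\"verbosity\": %d}", verbosity);
  return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The resulting string would be passed as the config argument of XGBSetGlobalConfig. Every value stays a scalar, which keeps the object flat as required.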
-
int XGBGetGlobalConfig(char const **out_config)
Get current global configuration (collection of parameters that apply globally).
- Parameters:
out_config – pointer that receives the returned global configuration, represented as a JSON string.
- Returns:
0 for success, -1 for failure
DMatrix
- group DMatrix
DMatrix is the basic data storage for XGBoost, used by all XGBoost algorithms for training, prediction and explanation. There are a few variants of DMatrix, including the normal DMatrix, which is a CSR matrix; QuantileDMatrix, which is used by histogram-based tree methods for saving memory; and lastly the experimental external-memory-based DMatrix, which reads data in batches during training. For the last two variants, see the Streaming group.
Functions
-
int XGDMatrixCreateFromFile(const char *fname, int silent, DMatrixHandle *out)
load a data matrix
- Deprecated:
since 2.0.0
- Parameters:
fname – the name of the file
silent – whether to print messages during loading
out – a loaded data matrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromURI(char const *config, DMatrixHandle *out)
load a data matrix
- Parameters:
config – JSON encoded parameters for DMatrix construction. Accepted fields are:
uri: The URI of the input file. The URI parameter format is required when loading text data. See Text Input Format of DMatrix for more info.
silent (optional): Whether to print message during loading. Default to true.
data_split_mode (optional): Whether the file was split by row or column beforehand for distributed computing. Default to row.
out – a loaded data matrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromCSREx(const size_t *indptr, const unsigned *indices, const float *data, size_t nindptr, size_t nelem, size_t num_col, DMatrixHandle *out)
create a matrix content from CSR format
- Deprecated:
since 2.0.0
-
int XGDMatrixCreateFromColumnar(char const *data, char const *config, DMatrixHandle *out)
Create a DMatrix from columnar data. (table)
- Parameters:
data – See XGBoosterPredictFromColumnar for details.
config – See XGDMatrixCreateFromDense for details.
out – The created dmatrix.
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromCSR(char const *indptr, char const *indices, char const *data, bst_ulong ncol, char const *config, DMatrixHandle *out)
Create a matrix from CSR matrix.
- Parameters:
indptr – JSON encoded array_interface to row pointers in CSR.
indices – JSON encoded array_interface to column indices in CSR.
data – JSON encoded array_interface to values in CSR.
ncol – Number of columns.
config – JSON encoded configuration. Required values are:
missing: Which value to represent missing value.
nthread (optional): Number of threads used for initializing DMatrix.
data_split_mode (optional): Whether the data was split by row or column beforehand. Default to row.
out – created dmatrix
- Returns:
0 when success, -1 when failure happens
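For readers unfamiliar with the CSR layout that indptr, indices and data describe, the following sketch converts a small row-major dense matrix into the three arrays. It is an illustration only: dense_to_csr is a hypothetical helper, and the element types (uint64_t row pointers, uint32_t column indices, float values) are assumptions, since the array_interface documents describe whatever types are actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a row-major dense matrix to CSR, skipping entries equal to
 * `missing`. `indptr` needs nrow + 1 slots; `indices` and `values` need room
 * for every non-missing entry. Returns the number of stored entries.
 * A NaN `missing` would need an isnan test instead of `!=`. */
size_t dense_to_csr(const float *dense, size_t nrow, size_t ncol, float missing,
                    uint64_t *indptr, uint32_t *indices, float *values) {
  assert(dense && indptr && indices && values);
  size_t nnz = 0;
  indptr[0] = 0;
  for (size_t r = 0; r < nrow; ++r) {
    for (size_t c = 0; c < ncol; ++c) {
      float v = dense[r * ncol + c];
      if (v != missing) {
        indices[nnz] = (uint32_t)c; /* column of this stored value */
        values[nnz] = v;
        ++nnz;
      }
    }
    indptr[r + 1] = nnz; /* row r occupies [indptr[r], indptr[r + 1]) */
  }
  return nnz;
}
```

For the 2 x 3 matrix {1, 0, 2; 0, 0, 3} with 0 treated as missing, this yields indptr = {0, 2, 3}, indices = {0, 2, 2} and values = {1, 2, 3}.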
-
int XGDMatrixCreateFromDense(char const *data, char const *config, DMatrixHandle *out)
Create a matrix from dense array.
- Parameters:
data – JSON encoded array_interface to array values.
config – JSON encoded configuration. Required values are:
missing: Which value to represent missing value.
nthread (optional): Number of threads used for initializing DMatrix.
data_split_mode (optional): Whether the data was split by row or column beforehand. Default to row.
out – created dmatrix
- Returns:
0 when success, -1 when failure happens
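The data argument is a JSON document rather than a raw pointer. A minimal sketch of how such a document might be produced for a row-major float32 matrix is shown below. The helper name make_dense_interface is hypothetical, and the field layout (data, shape, typestr, version 3) follows the array interface protocol as commonly used with this API; verify it against the demos in demo/c-api/ for the version in use.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Describe a row-major float32 matrix as an array-interface JSON document.
 * "<f4" assumes a little-endian host; `true` marks the buffer as read-only.
 * Returns 0 on success, -1 if `out` is too small. */
int make_dense_interface(const float *data, size_t nrow, size_t ncol,
                         char *out, size_t size) {
  assert(data != NULL && out != NULL);
  int n = snprintf(out, size,
                   "{\"data\": [%" PRIuPTR ", true], \"shape\": [%zu, %zu], "
                   "\"typestr\": \"<f4\", \"version\": 3}",
                   (uintptr_t)data, nrow, ncol);
  return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The formatted string would be passed as data to XGDMatrixCreateFromDense, together with a config document that supplies at least missing. The matrix memory must stay valid for the duration of that call, because the document only carries its address.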
-
int XGDMatrixCreateFromCSC(char const *indptr, char const *indices, char const *data, bst_ulong nrow, char const *config, DMatrixHandle *out)
Create a matrix from a CSC matrix.
- Parameters:
indptr – JSON encoded array_interface to column pointers in CSC.
indices – JSON encoded array_interface to row indices in CSC.
data – JSON encoded array_interface to values in CSC.
nrow – number of rows in the matrix.
config – JSON encoded configuration. Supported values are:
missing: Which value to represent missing value.
nthread (optional): Number of threads used for initializing DMatrix.
data_split_mode (optional): Whether the data was split by row or column beforehand. Default to row.
out – created dmatrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromCSCEx(const size_t *col_ptr, const unsigned *indices, const float *data, size_t nindptr, size_t nelem, size_t num_row, DMatrixHandle *out)
create a matrix content from CSC format
- Deprecated:
since 2.0.0
-
int XGDMatrixCreateFromMat(const float *data, bst_ulong nrow, bst_ulong ncol, float missing, DMatrixHandle *out)
create matrix content from dense matrix
- Parameters:
data – pointer to the data space
nrow – number of rows
ncol – number of columns
missing – which value to represent missing value
out – created dmatrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromMat_omp(const float *data, bst_ulong nrow, bst_ulong ncol, float missing, DMatrixHandle *out, int nthread)
create matrix content from dense matrix
- Parameters:
data – pointer to the data space
nrow – number of rows
ncol – number of columns
missing – which value to represent missing value
out – created dmatrix
nthread – number of threads (up to maximum cores available, if <=0 use all cores)
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromDT(void **data, const char **feature_stypes, bst_ulong nrow, bst_ulong ncol, DMatrixHandle *out, int nthread)
create matrix content from python data table
- Parameters:
data – pointer to pointer to column data
feature_stypes – pointer to strings
nrow – number of rows
ncol – number of columns
out – created dmatrix
nthread – number of threads (up to maximum cores available, if <=0 use all cores)
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromCudaColumnar(char const *data, char const *config, DMatrixHandle *out)
Create DMatrix from CUDA columnar format. (cuDF)
- Parameters:
data – Array of JSON encoded cuda_array_interface for each column.
config – JSON encoded configuration. Required values are:
missing: Which value to represent missing value.
nthread (optional): Number of threads used for initializing DMatrix.
data_split_mode (optional): Whether the data was split by row or column beforehand. Default to row.
out – created dmatrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromCudaArrayInterface(char const *data, char const *config, DMatrixHandle *out)
Create DMatrix from CUDA array.
- Parameters:
data – JSON encoded cuda_array_interface for array data.
config – JSON encoded configuration. Required values are:
missing: Which value to represent missing value.
nthread (optional): Number of threads used for initializing DMatrix.
data_split_mode (optional): Whether the data was split by row or column beforehand. Default to row.
out – created dmatrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixSliceDMatrix(DMatrixHandle handle, const int *idxset, bst_ulong len, DMatrixHandle *out)
create a new dmatrix from sliced content of existing matrix
- Parameters:
handle – instance of data matrix to be sliced
idxset – index set
len – length of index set
out – a sliced new matrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixSliceDMatrixEx(DMatrixHandle handle, const int *idxset, bst_ulong len, DMatrixHandle *out, int allow_groups)
create a new dmatrix from sliced content of existing matrix
- Parameters:
handle – instance of data matrix to be sliced
idxset – index set
len – length of index set
out – a sliced new matrix
allow_groups – allow slicing of an array with groups
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixFree(DMatrixHandle handle)
free space in data matrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixSaveBinary(DMatrixHandle handle, const char *fname, int silent)
save a data matrix into a binary file
- Parameters:
handle – an instance of data matrix
fname – file name
silent – print statistics when saving
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixSetInfoFromInterface(DMatrixHandle handle, char const *field, char const *c_interface_str)
Set a field in info from data described by an array interface.
- Parameters:
handle – an instance of data matrix
field – field name.
c_interface_str – JSON string representation of array interface.
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixSetFloatInfo(DMatrixHandle handle, const char *field, const float *array, bst_ulong len)
set a float vector as the content of a field in info
- Parameters:
handle – an instance of data matrix
field – field name, can be label, weight
array – pointer to float vector
len – length of array
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixSetUIntInfo(DMatrixHandle handle, const char *field, const unsigned *array, bst_ulong len)
- Deprecated:
since 2.1.0
Use XGDMatrixSetInfoFromInterface instead.
-
int XGDMatrixSetStrFeatureInfo(DMatrixHandle handle, const char *field, const char **features, const bst_ulong size)
Set string encoded information of all features.
Accepted fields are:
feature_name
feature_type
char const* feat_names[] = {"feat_0", "feat_1"};
XGDMatrixSetStrFeatureInfo(handle, "feature_name", feat_names, 2);
// i for integer, q for quantitative, c for categorical. Similarly "int" and "float"
// are also recognized.
char const* feat_types[] = {"i", "q"};
XGDMatrixSetStrFeatureInfo(handle, "feature_type", feat_types, 2);
- Parameters:
handle – An instance of data matrix
field – Field name
features – Pointer to array of strings.
size – Size of features pointer (number of strings passed in).
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixGetStrFeatureInfo(DMatrixHandle handle, const char *field, bst_ulong *size, const char ***out_features)
Get string encoded information of all features.
Accepted fields are:
feature_name
feature_type
Caller is responsible for copying out the data, before next call to any API function of XGBoost.
char const **c_out_features = NULL;
bst_ulong out_size = 0;
// Assuming the feature names are already set by `XGDMatrixSetStrFeatureInfo`.
XGDMatrixGetStrFeatureInfo(handle, "feature_name", &out_size, &c_out_features);
for (bst_ulong i = 0; i < out_size; ++i) {
  // Here we are simply printing the string. Copy it out if the feature name is
  // useful after printing.
  printf("feature %lu: %s\n", (unsigned long)i, c_out_features[i]);
}
- Parameters:
handle – An instance of data matrix
field – Field name
size – Size of output pointer features (number of strings returned).
out_features – Address of a pointer to array of strings. Result is stored in thread local memory.
- Returns:
0 when success, -1 when failure happens
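Because the returned strings live in thread local memory, they have to be copied before the next API call if they are needed later. One possible way to do this is sketched below; copy_strings and free_strings are hypothetical helpers, not part of the C API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Deep-copy `n` C strings so they outlive the library's thread-local buffer.
 * Returns NULL on allocation failure; release the result with free_strings. */
char **copy_strings(const char **src, size_t n) {
  assert(src != NULL || n == 0);
  char **dst = calloc(n ? n : 1, sizeof *dst);
  if (!dst) return NULL;
  for (size_t i = 0; i < n; ++i) {
    size_t len = strlen(src[i]) + 1; /* include the terminating NUL */
    dst[i] = malloc(len);
    if (!dst[i]) {
      for (size_t j = 0; j < i; ++j) free(dst[j]);
      free(dst);
      return NULL;
    }
    memcpy(dst[i], src[i], len);
  }
  return dst;
}

/* Release an array produced by copy_strings. */
void free_strings(char **strs, size_t n) {
  if (!strs) return;
  for (size_t i = 0; i < n; ++i) free(strs[i]);
  free(strs);
}
```

In real code, src and n would be the out_features and size values written by XGDMatrixGetStrFeatureInfo.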
-
int XGDMatrixSetDenseInfo(DMatrixHandle handle, const char *field, void const *data, bst_ulong size, int type)
- Deprecated:
since 2.1.0
Use XGDMatrixSetInfoFromInterface instead.
-
int XGDMatrixGetFloatInfo(const DMatrixHandle handle, const char *field, bst_ulong *out_len, const float **out_dptr)
get float info vector from matrix.
- Parameters:
handle – an instance of data matrix
field – field name
out_len – used to set result length
out_dptr – pointer to the result
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixGetUIntInfo(const DMatrixHandle handle, const char *field, bst_ulong *out_len, const unsigned **out_dptr)
get uint32 info vector from matrix
- Parameters:
handle – an instance of data matrix
field – field name
out_len – The length of the field.
out_dptr – pointer to the result
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixNumRow(DMatrixHandle handle, bst_ulong *out)
get number of rows.
- Parameters:
handle – the handle to the DMatrix
out – The address to hold number of rows.
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixNumCol(DMatrixHandle handle, bst_ulong *out)
get number of columns
- Parameters:
handle – the handle to the DMatrix
out – The output of number of columns
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixNumNonMissing(DMatrixHandle handle, bst_ulong *out)
Get number of valid values from DMatrix.
- Parameters:
handle – the handle to the DMatrix
out – The output of number of non-missing values
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixDataSplitMode(DMatrixHandle handle, bst_ulong *out)
Get the data split mode from DMatrix.
- Parameters:
handle – the handle to the DMatrix
out – The output of the data split mode
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixGetDataAsCSR(DMatrixHandle const handle, char const *config, bst_ulong *out_indptr, unsigned *out_indices, float *out_data)
Get the predictors from DMatrix as CSR matrix for testing. If this is a quantized DMatrix, quantized values are returned instead.
Unlike most XGBoost C functions, the caller of XGDMatrixGetDataAsCSR is required to allocate the memory for the return buffers instead of using thread local memory from XGBoost. This is to avoid allocating a huge memory buffer that cannot be freed until exiting the thread.
- Since
1.7.0
- Parameters:
handle – the handle to the DMatrix
config – JSON configuration string. At the moment it should be an empty document, preserved for future use.
out_indptr – indptr of output CSR matrix.
out_indices – Column index of output CSR matrix.
out_data – Data value of CSR matrix.
- Returns:
0 when success, -1 when failure happens
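Since the caller owns the output buffers, they have to be sized before the call. The sketch below assumes the usual CSR sizes: one row pointer per row plus one, and one index and one value per non-missing entry, with the counts obtained beforehand from XGDMatrixNumRow and XGDMatrixNumNonMissing. CsrBuffers and its helpers are hypothetical, and uint64_t and uint32_t are used as stand-ins for bst_ulong and unsigned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
  uint64_t *indptr;  /* nrow + 1 entries */
  uint32_t *indices; /* nnz entries */
  float *values;     /* nnz entries */
} CsrBuffers;

/* Free all buffers and reset the pointers so a second call is harmless. */
void csr_buffers_free(CsrBuffers *buf) {
  if (!buf) return;
  free(buf->indptr);
  free(buf->indices);
  free(buf->values);
  buf->indptr = NULL;
  buf->indices = NULL;
  buf->values = NULL;
}

/* Allocate zeroed output buffers for a CSR matrix with `nrow` rows and `nnz`
 * stored values. Returns 0 on success, -1 on allocation failure. */
int csr_buffers_alloc(CsrBuffers *buf, size_t nrow, size_t nnz) {
  assert(buf != NULL);
  size_t n = nnz ? nnz : 1; /* avoid zero-sized allocations */
  buf->indptr = calloc(nrow + 1, sizeof *buf->indptr);
  buf->indices = calloc(n, sizeof *buf->indices);
  buf->values = calloc(n, sizeof *buf->values);
  if (!buf->indptr || !buf->indices || !buf->values) {
    csr_buffers_free(buf);
    return -1;
  }
  return 0;
}
```

The three pointers would then be passed as out_indptr, out_indices and out_data, and released by the caller once the CSR data is no longer needed.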
-
int XGDMatrixGetQuantileCut(DMatrixHandle const handle, char const *config, char const **out_indptr, char const **out_data)
Export the quantile cuts used for training histogram-based models like hist and approx. Useful for model compression.
- Since
2.0.0
- Parameters:
handle – the handle to the DMatrix
config – JSON configuration string. At the moment it should be an empty document, preserved for future use.
out_indptr – indptr of output CSC matrix represented by a JSON encoded __(cuda_)array_interface__.
out_data – Data value of CSC matrix represented by a JSON encoded __(cuda_)array_interface__.
Streaming
- group Streaming
Quantile DMatrix and external memory DMatrix can be created from batches of data.
There are 2 sets of data callbacks for DMatrix. The first one is currently used exclusively by the JVM packages. It uses XGBoostBatchCSR to accept batches of CSR formatted input and concatenates them into 1 final big CSR. The related functions are XGBCallbackSetData, XGBCallbackDataIterNext and XGDMatrixCreateFromDataIter.
The other set is used by the external data iterator. It accepts foreign data iterators as callbacks. There are 2 different scenarios where users might want to pass in callbacks instead of raw data. The first is the Quantile DMatrix used by hist and GPU Hist. In this case, the data is first compressed by quantile sketching and then merged. This is particularly useful in a distributed setting as it eliminates 2 copies of data: 1 made by a concat from an external library to turn the data into a blob for normal DMatrix initialization, another made by the internal CSR copy of DMatrix. The second use case is external memory support, where users can pass a custom data iterator into XGBoost for loading data in batches. There are short notes on each of the use cases in the respective DMatrix factory function.
Related functions are:
Factory functions
XGDMatrixCreateFromCallback for external memory
XGQuantileDMatrixCreateFromCallback for quantile DMatrix
Proxy that callers can use to pass data to XGBoost
Typedefs
-
typedef void *DataIterHandle
handle to an external data iterator
-
typedef void *DataHolderHandle
handle to an internal data holder.
-
typedef int XGBCallbackSetData(DataHolderHandle handle, XGBoostBatchCSR batch)
Callback to set the data to handle.
- Param handle:
The handle to the callback.
- Param batch:
The data content to be set.
-
typedef int XGBCallbackDataIterNext(DataIterHandle data_handle, XGBCallbackSetData *set_function, DataHolderHandle set_function_handle)
The data reading callback function. The iterator yields the data one batch (a subset of the data) at a time.
If there is data, the function will call set_function to set the data.
- Param data_handle:
The handle to the callback.
- Param set_function:
The callback used to set the batch returned by the iterator.
- Param set_function_handle:
The handle to be passed to set function.
- Return:
0 if the end has been reached and no batch is returned.
-
typedef int XGDMatrixCallbackNext(DataIterHandle iter)
Callback function prototype for getting next batch of data.
- Param iter:
A handler to the user defined iterator.
- Return:
0 when success, -1 when failure happens
-
typedef void DataIterResetCallback(DataIterHandle handle)
Callback function prototype for resetting external iterator.
Functions
-
int XGDMatrixCreateFromDataIter(DataIterHandle data_handle, XGBCallbackDataIterNext *callback, const char *cache_info, float missing, DMatrixHandle *out)
Create a DMatrix from a data iterator.
- Parameters:
data_handle – The handle to the data.
callback – The callback to get the data.
cache_info – Additional information about cache file, can be null.
missing – Which value to represent missing value.
out – The created DMatrix
- Returns:
0 when success, -1 when failure happens.
-
int XGProxyDMatrixCreate(DMatrixHandle *out)
Create a DMatrix proxy for setting data; it can be freed by XGDMatrixFree.
Second set of callback functions, used for constructing a Quantile DMatrix or an external memory DMatrix with a custom iterator.
- Parameters:
out – The created proxy DMatrix
- Returns:
0 when success, -1 when failure happens
-
int XGDMatrixCreateFromCallback(DataIterHandle iter, DMatrixHandle proxy, DataIterResetCallback *reset, XGDMatrixCallbackNext *next, char const *config, DMatrixHandle *out)
Create an external memory DMatrix with data iterator.
Short note for how to use the second set of callbacks for external memory data support:
Step 0: Define a data iterator with 2 methods, reset and next.
Step 1: Create a DMatrix proxy by XGProxyDMatrixCreate and hold the handle.
Step 2: Pass the iterator handle, proxy handle and 2 methods into XGDMatrixCreateFromCallback, along with other parameters encoded as a JSON object.
Step 3: Call appropriate data setters in next functions.
- Parameters:
iter – A handle to external data iterator.
proxy – A DMatrix proxy handle created by XGProxyDMatrixCreate.
reset – Callback function resetting the iterator state.
next – Callback function yielding the next batch of data.
config – JSON encoded parameters for DMatrix construction. Accepted fields are:
missing: Which value to represent missing value
cache_prefix: The path of cache file, caller must initialize all the directories in this path.
nthread (optional): Number of threads used for initializing DMatrix.
out – The created external memory DMatrix
- Returns:
0 when success, -1 when failure happens
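Step 0 of the note above, a data iterator with reset and next, might look like the following sketch. It is self-contained and does not call into XGBoost: IterHandle is a local stand-in for DataIterHandle, BatchIter is a hypothetical iterator over an in-memory matrix, and the place where a proxy setter such as XGProxyDMatrixSetDataDense would be called is only marked by a comment. The return convention of next used here (1 while a batch was produced, 0 once the data is exhausted) is an assumption modelled on the demos under demo/c-api/; confirm it against the header of the version in use.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the DataIterHandle typedef from c_api.h. */
typedef void *IterHandle;

typedef struct {
  const float *data; /* all rows, row-major */
  size_t nrow, ncol;
  size_t batch_rows; /* rows per batch */
  size_t cursor;     /* first row of the next batch */
  const float *cur;  /* start of the batch most recently produced */
  size_t cur_rows;   /* number of rows in that batch */
} BatchIter;

/* reset: rewind the iterator to the first batch. */
void batch_iter_reset(IterHandle handle) {
  BatchIter *it = (BatchIter *)handle;
  it->cursor = 0;
  it->cur = NULL;
  it->cur_rows = 0;
}

/* next: expose one batch. Returns 1 if a batch was produced, 0 at the end.
 * Real code would pass the batch to the proxy with a data setter here. */
int batch_iter_next(IterHandle handle) {
  BatchIter *it = (BatchIter *)handle;
  assert(it->batch_rows > 0);
  if (it->cursor >= it->nrow) return 0;
  size_t left = it->nrow - it->cursor;
  it->cur_rows = left < it->batch_rows ? left : it->batch_rows;
  it->cur = it->data + it->cursor * it->ncol;
  it->cursor += it->cur_rows;
  return 1;
}
```

The two functions have the shapes of DataIterResetCallback and XGDMatrixCallbackNext, so a pointer to a BatchIter could serve as the iter argument in Step 2.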
-
int XGQuantileDMatrixCreateFromCallback(DataIterHandle iter, DMatrixHandle proxy, DataIterHandle ref, DataIterResetCallback *reset, XGDMatrixCallbackNext *next, char const *config, DMatrixHandle *out)
Create a Quantile DMatrix with data iterator.
Short note for how to use the second set of callbacks for the (GPU)Hist tree method:
Step 0: Define a data iterator with 2 methods, reset and next.
Step 1: Create a DMatrix proxy by XGProxyDMatrixCreate and hold the handle.
Step 2: Pass the iterator handle, proxy handle and 2 methods into XGQuantileDMatrixCreateFromCallback.
Step 3: Call appropriate data setters in next functions.
See test_iterative_dmatrix.cu or Python interface for examples.
- Parameters:
iter – A handle to external data iterator.
proxy – A DMatrix proxy handle created by XGProxyDMatrixCreate.
ref – Reference DMatrix for providing quantile information.
reset – Callback function resetting the iterator state.
next – Callback function yielding the next batch of data.
config – JSON encoded parameters for DMatrix construction. Accepted fields are:
missing: Which value to represent missing value
nthread (optional): Number of threads used for initializing DMatrix.
max_bin (optional): Maximum number of bins for building histogram. Must be consistent with the corresponding booster training parameter.
out – The created Quantile DMatrix.
- Returns:
0 when success, -1 when failure happens
-
int XGExtMemQuantileDMatrixCreateFromCallback(DataIterHandle iter, DMatrixHandle proxy, DataIterHandle ref, DataIterResetCallback *reset, XGDMatrixCallbackNext *next, char const *config, DMatrixHandle *out)
Create a Quantile DMatrix backed by external memory.
- Since
3.0.0
Note
This is still under development, not ready for test yet.
- Parameters:
iter – A handle to external data iterator.
proxy – A DMatrix proxy handle created by XGProxyDMatrixCreate.
ref – Reference DMatrix for providing quantile information.
reset – Callback function resetting the iterator state.
next – Callback function yielding the next batch of data.
config – JSON encoded parameters for DMatrix construction. Accepted fields are:
missing: Which value to represent missing value
cache_prefix: The path of cache file, caller must initialize all the directories in this path.
nthread (optional): Number of threads used for initializing DMatrix.
max_bin (optional): Maximum number of bins for building histogram. Must be consistent with the corresponding booster training parameter.
on_host (optional): Whether the data should be placed on host memory. Used by GPU inputs.
out – The created Quantile DMatrix.
- Returns:
0 when success, -1 when failure happens
-
int XGDeviceQuantileDMatrixCreateFromCallback(DataIterHandle iter, DMatrixHandle proxy, DataIterResetCallback *reset, XGDMatrixCallbackNext *next, float missing, int nthread, int max_bin, DMatrixHandle *out)
Create a Device Quantile DMatrix with data iterator.
- Deprecated:
since 1.7.0
-
int XGProxyDMatrixSetDataCudaArrayInterface(DMatrixHandle handle, const char *c_interface_str)
Set data on a DMatrix proxy.
- Parameters:
handle – A DMatrix proxy created by XGProxyDMatrixCreate
c_interface_str – Null terminated JSON document string representation of CUDA array interface.
- Returns:
0 when success, -1 when failure happens
-
int XGProxyDMatrixSetDataColumnar(DMatrixHandle handle, char const *c_interface_str)
Set columnar (table) data on a DMatrix proxy.
- Parameters:
handle – A DMatrix proxy created by XGProxyDMatrixCreate
c_interface_str – See XGBoosterPredictFromColumnar for details.
- Returns:
0 when success, -1 when failure happens
-
int XGProxyDMatrixSetDataCudaColumnar(DMatrixHandle handle, const char *c_interface_str)
Set data on a DMatrix proxy.
- Parameters:
handle – A DMatrix proxy created by XGProxyDMatrixCreate
c_interface_str – Null terminated JSON document string representation of CUDA array interface, with an array of columns.
- Returns:
0 when success, -1 when failure happens
-
int XGProxyDMatrixSetDataDense(DMatrixHandle handle, char const *c_interface_str)
Set data on a DMatrix proxy.
- Parameters:
handle – A DMatrix proxy created by XGProxyDMatrixCreate
c_interface_str – Null terminated JSON document string representation of array interface.
- Returns:
0 when success, -1 when failure happens
-
int XGProxyDMatrixSetDataCSR(DMatrixHandle handle, char const *indptr, char const *indices, char const *data, bst_ulong ncol)
Set data on a DMatrix proxy.
- Parameters:
handle – A DMatrix proxy created by XGProxyDMatrixCreate
indptr – JSON encoded array_interface to row pointer in CSR.
indices – JSON encoded array_interface to column indices in CSR.
data – JSON encoded array_interface to values in CSR.
ncol – The number of columns of input CSR matrix.
- Returns:
0 when success, -1 when failure happens
-
struct XGBoostBatchCSR
- #include <c_api.h>
Mini batch used in XGBoost Data Iteration.
Booster
- group Booster
The Booster class is the gradient-boosted model for XGBoost.
Functions
-
int XGBoosterCreate(const DMatrixHandle dmats[], bst_ulong len, BoosterHandle *out)
create xgboost learner
- Parameters:
dmats – matrices that are set to be cached
len – length of dmats
out – handle to the result booster
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterFree(BoosterHandle handle)
free obj in handle
- Parameters:
handle – handle to be freed
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSlice(BoosterHandle handle, int begin_layer, int end_layer, int step, BoosterHandle *out)
Slice a model using boosting index. The slice m:n indicates taking all trees that were fit during the boosting rounds m, (m+1), (m+2), …, (n-1).
- Parameters:
handle – Booster to be sliced.
begin_layer – start of the slice
end_layer – end of the slice; end_layer=0 is equivalent to end_layer=num_boost_round
step – step size of the slice
out – Sliced booster.
- Returns:
0 when success, -1 when failure happens, -2 when index is out of bound.
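The index arithmetic of a slice can be illustrated without a booster. The sketch below lists the boosting rounds that a begin_layer, end_layer, step triple would select, assuming step behaves like the stride of a half-open range. slice_rounds is a hypothetical helper, and its bounds checks are a simplification rather than a copy of the library's validation.

```c
#include <assert.h>
#include <stddef.h>

/* List the boosting rounds selected by [begin, end) with stride `step`.
 * end == 0 means "up to total_rounds". Writes at most `cap` rounds to `out`
 * and returns the number written, or -2 when the range is out of bounds,
 * echoing the return code of XGBoosterSlice. */
int slice_rounds(int begin, int end, int step, int total_rounds,
                 int *out, size_t cap) {
  assert(out != NULL);
  if (end == 0) end = total_rounds;
  if (step <= 0 || begin < 0 || begin >= end || end > total_rounds) return -2;
  size_t n = 0;
  for (int r = begin; r < end; r += step) {
    if (n == cap) break; /* output buffer is full */
    out[n++] = r;
  }
  return (int)n;
}
```

With 10 boosted rounds, the triple (2, 8, 2) selects rounds 2, 4 and 6, while (7, 0, 1) selects rounds 7, 8 and 9.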
-
int XGBoosterBoostedRounds(BoosterHandle handle, int *out)
Get number of boosted rounds from gradient booster. When process_type is update, this number might drop due to removed tree.
- Parameters:
handle – Handle to booster.
out – Pointer to output integer.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSetParam(BoosterHandle handle, const char *name, const char *value)
set parameters
- Parameters:
handle – handle
name – parameter name
value – value of parameter
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterGetNumFeature(BoosterHandle handle, bst_ulong *out)
get number of features
- Parameters:
handle – Handle to booster.
out – number of features
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterUpdateOneIter(BoosterHandle handle, int iter, DMatrixHandle dtrain)
update the model in one round using dtrain
- Parameters:
handle – handle
iter – current iteration round
dtrain – training data
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterBoostOneIter(BoosterHandle handle, DMatrixHandle dtrain, float *grad, float *hess, bst_ulong len)
- Deprecated:
since 2.1.0
-
int XGBoosterTrainOneIter(BoosterHandle handle, DMatrixHandle dtrain, int iter, char const *grad, char const *hess)
Update a model with gradient and Hessian. This is used for training with a custom objective function.
- Since
2.0.0
- Parameters:
handle – handle
dtrain – The training data.
iter – The current iteration round. When training continuation is used, the count should restart.
grad – JSON encoded __(cuda_)array_interface__ for gradient.
hess – JSON encoded __(cuda_)array_interface__ for Hessian.
- Returns:
0 when success, -1 when failure happens
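To make the custom objective concrete, the sketch below computes the per-sample gradient and Hessian of a squared-error loss, 0.5 * (pred - label)^2. The helper squared_error_grad_hess is hypothetical and the choice of loss is only an example; any twice-differentiable objective could be substituted.

```c
#include <assert.h>
#include <stddef.h>

/* Squared-error objective, 0.5 * (pred - label)^2, per sample:
 * gradient = pred - label, Hessian = 1. */
void squared_error_grad_hess(const float *pred, const float *label, size_t n,
                             float *grad, float *hess) {
  assert(pred && label && grad && hess);
  for (size_t i = 0; i < n; ++i) {
    grad[i] = pred[i] - label[i];
    hess[i] = 1.0f;
  }
}
```

The two output arrays would then each be described by a JSON encoded array interface document and passed as grad and hess.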
-
int XGBoosterEvalOneIter(BoosterHandle handle, int iter, DMatrixHandle dmats[], const char *evnames[], bst_ulong len, const char **out_result)
get evaluation statistics for xgboost
- Parameters:
handle – handle
iter – current iteration round
dmats – pointers to data to be evaluated
evnames – pointers to names of each data
len – length of dmats
out_result – the string containing evaluation statistics
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterDumpModel(BoosterHandle handle, const char *fmap, int with_stats, bst_ulong *out_len, const char ***out_dump_array)
dump model, return array of strings representing model dump
- Parameters:
handle – handle
fmap – name of the fmap file; can be an empty string
with_stats – whether to dump with statistics
out_len – length of output array
out_dump_array – pointer to hold the string dump of each model
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterDumpModelEx(BoosterHandle handle, const char *fmap, int with_stats, const char *format, bst_ulong *out_len, const char ***out_dump_array)
dump model, return array of strings representing model dump
- Parameters:
handle – handle
fmap – name of the fmap file; can be an empty string
with_stats – whether to dump with statistics
format – the format to dump the model in
out_len – length of output array
out_dump_array – pointer to hold the string dump of each model
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterDumpModelWithFeatures(BoosterHandle handle, int fnum, const char **fname, const char **ftype, int with_stats, bst_ulong *out_len, const char ***out_models)
dump model, return array of strings representing model dump
- Parameters:
handle – handle
fnum – number of features
fname – names of features
ftype – types of features
with_stats – whether to dump with statistics
out_len – length of output array
out_models – pointer to hold the string dump of each model
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterDumpModelExWithFeatures(BoosterHandle handle, int fnum, const char **fname, const char **ftype, int with_stats, const char *format, bst_ulong *out_len, const char ***out_models)
dump model, return array of strings representing model dump
- Parameters:
handle – handle
fnum – number of features
fname – names of features
ftype – types of features
with_stats – whether to dump with statistics
format – the format to dump the model in
out_len – length of output array
out_models – pointer to hold the string dump of each model
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterGetAttr(BoosterHandle handle, const char *key, const char **out, int *success)
Get string attribute from Booster.
- Parameters:
handle – handle
key – The key of the attribute.
out – The result attribute, can be NULL if the attribute does not exist.
success – Whether the result is contained in out.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSetAttr(BoosterHandle handle, const char *key, const char *value)
Set or delete string attribute.
- Parameters:
handle – handle
key – The key of the attribute.
value – The value to be saved. If nullptr, the attribute would be deleted.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterGetAttrNames(BoosterHandle handle, bst_ulong *out_len, const char ***out)
Get the names of all attributes from Booster.
- Parameters:
handle – handle
out_len – the argument to hold the output length
out – pointer to hold the output attribute strings
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSetStrFeatureInfo(BoosterHandle handle, const char *field, const char **features, const bst_ulong size)
Set string encoded feature info in Booster, similar to the feature info in DMatrix.
Accepted fields are:
feature_name
feature_type
- Parameters:
handle – An instance of Booster
field – Field name
features – Pointer to array of strings.
size – Size of the features pointer (number of strings passed in).
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterGetStrFeatureInfo(BoosterHandle handle, const char *field, bst_ulong *len, const char ***out_features)
Get string encoded feature info from Booster, similar to feature info in DMatrix.
Accepted fields are:
feature_name
feature_type
Caller is responsible for copying out the data, before next call to any API function of XGBoost.
- Parameters:
handle – An instance of Booster
field – Field name
len – Size of the output pointer features (number of strings returned).
out_features – Address of a pointer to array of strings. Result is stored in thread local memory.
- Returns:
0 when success, -1 when failure happens
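Because the returned strings live in thread-local memory, they have to be copied before the next XGBoost call. The sketch below shows one way to do that. `copy_strings` and `free_strings` are hypothetical helpers, not part of the XGBoost API; they operate on the array and length that XGBoosterGetStrFeatureInfo reports.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of XGBoost): deep-copy `len` strings so the
 * result stays valid after later XGBoost calls reuse the thread-local buffer.
 * Returns NULL on allocation failure; release with free_strings(). */
char **copy_strings(const char **src, size_t len) {
  assert(src != NULL || len == 0);
  char **dst = calloc(len + 1, sizeof *dst); /* +1 keeps the allocation non-empty */
  if (dst == NULL) return NULL;
  for (size_t i = 0; i < len; ++i) {
    size_t n = strlen(src[i]) + 1; /* include the terminating NUL */
    dst[i] = malloc(n);
    if (dst[i] == NULL) {
      for (size_t j = 0; j < i; ++j) free(dst[j]);
      free(dst);
      return NULL;
    }
    memcpy(dst[i], src[i], n);
  }
  return dst;
}

/* Release an array produced by copy_strings(). */
void free_strings(char **strs, size_t len) {
  if (strs == NULL) return;
  for (size_t i = 0; i < len; ++i) free(strs[i]);
  free(strs);
}
```

A caller would pass the `out_features` array and `len` value straight into `copy_strings` right after the API call returns 0.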
-
int XGBoosterFeatureScore(BoosterHandle handle, const char *config, bst_ulong *out_n_features, char const ***out_features, bst_ulong *out_dim, bst_ulong const **out_shape, float const **out_scores)
Calculate feature scores for tree models. When used on linear model, only the weight importance type is defined, and output scores is a row major matrix with shape [n_features, n_classes] for multi-class model. For tree model, out_n_feature is always equal to out_n_scores and has multiple definitions of importance type.
- Parameters:
handle – An instance of Booster
config – Parameters for computing scores encoded as JSON. Accepted JSON keys are:
importance_type: A JSON string with following possible values:
’weight’: the number of times a feature is used to split the data across all trees.
’gain’: the average gain across all splits the feature is used in.
’cover’: the average coverage across all splits the feature is used in.
’total_gain’: the total gain across all splits the feature is used in.
’total_cover’: the total coverage across all splits the feature is used in.
feature_map: An optional JSON string with URI or path to the feature map file.
feature_names: An optional JSON array with string names for each feature.
out_n_features – Length of output feature names.
out_features – An array of string as feature names, ordered the same as output scores.
out_dim – Dimension of output feature scores.
out_shape – Shape of output feature scores with length of out_dim.
out_scores – An array of floating point as feature scores with shape of out_shape.
- Returns:
0 when success, -1 when failure happens
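The row-major layout described above can be indexed as in the following sketch. `score_at` is a hypothetical helper, and the use of `uint64_t` assumes that `bst_ulong` is a 64-bit unsigned integer; check `c_api.h` for the actual typedef.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of XGBoost): read one entry from a row-major
 * score array. `shape` has `dim` entries; dim == 1 is a flat vector of
 * per-feature scores, dim == 2 is [n_features, n_classes]. */
float score_at(const float *scores, const uint64_t *shape, uint64_t dim,
               size_t feature, size_t cls) {
  assert(scores != NULL && shape != NULL);
  assert(dim == 1 || dim == 2);
  assert(feature < shape[0]);
  if (dim == 1) {
    assert(cls == 0); /* a flat vector has no class axis */
    return scores[feature];
  }
  assert(cls < shape[1]);
  return scores[feature * shape[1] + cls]; /* row-major offset */
}
```

The entry at `feature` lines up with the name at the same index in `out_features`, since the names are ordered the same as the scores.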
Prediction
- group Prediction
These functions are used for running prediction and explanation algorithms.
Functions
-
int XGBoosterPredict(BoosterHandle handle, DMatrixHandle dmat, int option_mask, unsigned ntree_limit, int training, bst_ulong *out_len, const float **out_result)
Make prediction based on dmat (deprecated, use XGBoosterPredictFromDMatrix instead).
- Deprecated:
See also XGBoosterPredictFromDMatrix
- Parameters:
handle – handle
dmat – data matrix
option_mask – Bit-mask of options taken in prediction. Possible values:
0: normal prediction
1: output margin instead of transformed value
2: output leaf index of trees instead of leaf value; note that the leaf index is unique per tree
4: output feature contributions to individual predictions
ntree_limit – Limit on the number of trees used for prediction; only valid for boosted trees. When set to 0, all the trees are used.
training – Whether the prediction function is used as part of a training loop. Prediction can be run in 2 scenarios:
Given data matrix X, obtain prediction y_pred from the model.
Obtain the prediction for computing gradients. For example, DART booster performs dropout during training, and the prediction result will be different from the one obtained by normal inference step due to dropped trees.
Set training=false for the first scenario. Set training=true for the second scenario. The second scenario applies when you are defining a custom objective function.
out_len – Used to store the length of the returned result
out_result – Used to set a pointer to the result array
- Returns:
0 when success, -1 when failure happens
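Since option_mask is a bit-mask, the documented values can be combined with bitwise OR and tested with bitwise AND. The sketch below uses illustrative constant names; only the numeric values come from the parameter description, and the identifiers are hypothetical. Whether a given combination is meaningful for a particular model is not covered here.

```c
#include <assert.h>

/* Illustrative names only: the numeric values are the documented ones,
 * the identifiers are hypothetical and not defined by XGBoost. */
enum {
  PRED_NORMAL = 0,        /* normal prediction */
  PRED_OUTPUT_MARGIN = 1, /* output margin instead of transformed value */
  PRED_LEAF_INDEX = 2,    /* output leaf index of trees */
  PRED_CONTRIBS = 4       /* output feature contributions */
};

/* Returns non-zero when `flag` is set in `option_mask`. */
int has_option(int option_mask, int flag) {
  assert(flag == PRED_OUTPUT_MARGIN || flag == PRED_LEAF_INDEX ||
         flag == PRED_CONTRIBS);
  return (option_mask & flag) != 0;
}
```

For example, `PRED_OUTPUT_MARGIN | PRED_CONTRIBS` yields the mask value 5.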
-
int XGBoosterPredictFromDMatrix(BoosterHandle handle, DMatrixHandle dmat, char const *config, bst_ulong const **out_shape, bst_ulong *out_dim, float const **out_result)
Make prediction from DMatrix, replacing XGBoosterPredict.
See also
XGBoosterPredictFromDense, XGBoosterPredictFromCSR, XGBoosterPredictFromCudaArray, XGBoosterPredictFromCudaColumnar
- Parameters:
handle – Booster handle
dmat – DMatrix handle
config – String encoded predict configuration in JSON format, with the following available fields in the JSON object:
“type”: [0, 6]
0: normal prediction
1: output margin
2: predict contribution
3: predict approximated contribution
4: predict feature interaction
5: predict approximated feature interaction
6: predict leaf
“training”: bool Whether the prediction function is used as part of a training loop. Not used for inplace prediction. Prediction can be run in 2 scenarios:
Given data matrix X, obtain prediction y_pred from the model.
Obtain the prediction for computing gradients. For example, DART booster performs dropout during training, and the prediction result will be different from the one obtained by normal inference step due to dropped trees.
Set training=false for the first scenario. Set training=true for the second scenario. The second scenario applies when you are defining a custom objective function.
“iteration_begin”: int Beginning iteration of prediction.
“iteration_end”: int End iteration of prediction. When set to 0, this becomes the size of the tree model (all the trees).
“strict_shape”: bool Whether to reshape the output with stricter rules. If set to true, normal/margin/contrib/interaction prediction outputs a consistent shape regardless of whether a multi-class model is used, and leaf prediction outputs a 4-dim array representing: (n_samples, n_iterations, n_classes, n_trees_in_forest)
Example JSON input for running a normal prediction with strict output shape, 2 dim for softprob, 1 dim for others.
{ "type": 0, "training": false, "iteration_begin": 0, "iteration_end": 0, "strict_shape": true }
out_shape – Shape of output prediction (copy before use).
out_dim – Dimension of output prediction.
out_result – Buffer storing prediction value (copy before use).
- Returns:
0 when success, -1 when failure happens
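The configuration string and the returned shape can be handled as in the sketch below. Both helpers are hypothetical and not part of XGBoost: `make_predict_config` formats the five documented fields, and `prediction_size` multiplies the entries of `out_shape` to get the number of floats in `out_result`. The `uint64_t` type assumes `bst_ulong` is a 64-bit unsigned integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the predict configuration as JSON. Returns the
 * number of characters written, or -1 if `buf` is too small. */
int make_predict_config(char *buf, size_t cap, int type, int training,
                        unsigned iteration_begin, unsigned iteration_end,
                        int strict_shape) {
  assert(buf != NULL);
  assert(type >= 0 && type <= 6); /* documented range of "type" */
  int n = snprintf(buf, cap,
                   "{\"type\": %d, \"training\": %s, \"iteration_begin\": %u, "
                   "\"iteration_end\": %u, \"strict_shape\": %s}",
                   type, training ? "true" : "false", iteration_begin,
                   iteration_end, strict_shape ? "true" : "false");
  return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Hypothetical helper: number of floats in a prediction result, computed as
 * the product of the `dim` entries in `shape`. */
uint64_t prediction_size(const uint64_t *shape, uint64_t dim) {
  assert(shape != NULL || dim == 0);
  uint64_t total = 1;
  for (uint64_t i = 0; i < dim; ++i) total *= shape[i];
  return total;
}
```

Knowing the element count up front makes it straightforward to copy `out_result` into caller-owned memory before the next API call.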
-
int XGBoosterPredictFromDense(BoosterHandle handle, char const *values, char const *config, DMatrixHandle m, bst_ulong const **out_shape, bst_ulong *out_dim, const float **out_result)
Inplace prediction from CPU dense matrix.
Note
If the booster is configured to run on a CUDA device, XGBoost falls back to run prediction with DMatrix with a performance warning.
- Parameters:
handle – Booster handle.
values – JSON encoded array_interface to values.
config – See XGBoosterPredictFromDMatrix for more info. Additional fields for inplace prediction are:
“missing”: float
m – An optional (NULL if not available) proxy DMatrix instance storing meta info.
out_shape – See XGBoosterPredictFromDMatrix for more info.
out_dim – See XGBoosterPredictFromDMatrix for more info.
out_result – See XGBoosterPredictFromDMatrix for more info.
- Returns:
0 when success, -1 when failure happens
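The `values` argument is a JSON document following the array-interface layout: an integer data pointer with a read-only flag, a shape, a type string, and a version. The sketch below formats such a document for a row-major float32 matrix. `make_dense_interface` is a hypothetical helper, and the `"<f4"` type string assumes a little-endian host.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of XGBoost): describe a row-major float32
 * matrix as array-interface JSON. Assumes a little-endian host ("<f4").
 * Returns the number of characters written, or -1 if `buf` is too small. */
int make_dense_interface(char *buf, size_t cap, const float *data,
                         size_t n_rows, size_t n_cols) {
  assert(buf != NULL && data != NULL);
  int n = snprintf(buf, cap,
                   "{\"data\": [%" PRIuPTR ", true], \"shape\": [%zu, %zu], "
                   "\"typestr\": \"<f4\", \"version\": 3}",
                   (uintptr_t)data, n_rows, n_cols);
  return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The matrix memory must stay alive and unchanged for as long as the JSON string is in use, because the document only carries the address, not the data.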
-
int XGBoosterPredictFromColumnar(BoosterHandle handle, char const *values, char const *config, DMatrixHandle m, bst_ulong const **out_shape, bst_ulong *out_dim, const float **out_result)
Inplace prediction from CPU columnar data (table).
Note
If the booster is configured to run on a CUDA device, XGBoost falls back to run prediction with DMatrix with a performance warning.
- Parameters:
handle – Booster handle.
values – A JSON array of array_interface for each column.
config – See XGBoosterPredictFromDMatrix for more info. Additional fields for inplace prediction are:
“missing”: float
m – An optional (NULL if not available) proxy DMatrix instance storing meta info.
out_shape – See XGBoosterPredictFromDMatrix for more info.
out_dim – See XGBoosterPredictFromDMatrix for more info.
out_result – See XGBoosterPredictFromDMatrix for more info.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterPredictFromCSR(BoosterHandle handle, char const *indptr, char const *indices, char const *values, bst_ulong ncol, char const *config, DMatrixHandle m, bst_ulong const **out_shape, bst_ulong *out_dim, const float **out_result)
Inplace prediction from CPU CSR matrix.
Note
If the booster is configured to run on a CUDA device, XGBoost falls back to run prediction with DMatrix with a performance warning.
- Parameters:
handle – Booster handle.
indptr – JSON encoded array_interface to row pointer in CSR.
indices – JSON encoded array_interface to column indices in CSR.
values – JSON encoded array_interface to values in CSR.
ncol – Number of features in data.
config – See XGBoosterPredictFromDMatrix for more info. Additional fields for inplace prediction are:
“missing”: float
m – An optional (NULL if not available) proxy DMatrix instance storing meta info.
out_shape – See XGBoosterPredictFromDMatrix for more info.
out_dim – See XGBoosterPredictFromDMatrix for more info.
out_result – See XGBoosterPredictFromDMatrix for more info.
- Returns:
0 when success, -1 when failure happens
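For readers unfamiliar with the three CSR arrays, the sketch below builds them from a small dense matrix. `dense_to_csr` is a hypothetical helper and the integer widths are illustrative choices, not requirements taken from the header. Note that dropping zeros turns them from explicit values into absent entries; keep them explicit if that distinction matters for your data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of XGBoost): compress a row-major dense
 * matrix into CSR arrays, skipping zeros. `indptr` needs n_rows + 1 slots;
 * `indices` and `values` need room for every non-zero entry.
 * Returns the number of stored entries. */
size_t dense_to_csr(const float *dense, size_t n_rows, size_t n_cols,
                    uint64_t *indptr, uint32_t *indices, float *values) {
  assert(dense && indptr && indices && values);
  size_t nnz = 0;
  indptr[0] = 0;
  for (size_t r = 0; r < n_rows; ++r) {
    for (size_t c = 0; c < n_cols; ++c) {
      float v = dense[r * n_cols + c];
      if (v != 0.0f) {
        indices[nnz] = (uint32_t)c; /* column index of this entry */
        values[nnz] = v;
        ++nnz;
      }
    }
    indptr[r + 1] = nnz; /* row r occupies [indptr[r], indptr[r + 1]) */
  }
  return nnz;
}
```

Each of the three arrays would then be described by its own array_interface JSON string, with `ncol` passed separately because CSR does not record the column count.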
-
int XGBoosterPredictFromCudaArray(BoosterHandle handle, char const *values, char const *config, DMatrixHandle m, bst_ulong const **out_shape, bst_ulong *out_dim, const float **out_result)
Inplace prediction from CUDA dense matrix (cupy in Python).
Note
If the booster is configured to run on a CPU, XGBoost falls back to run prediction with DMatrix with a performance warning.
- Parameters:
handle – Booster handle
values – JSON encoded cuda_array_interface to values.
config – See XGBoosterPredictFromDMatrix for more info. Additional fields for inplace prediction are:
“missing”: float
m – An optional (NULL if not available) proxy DMatrix instance storing meta info.
out_shape – See XGBoosterPredictFromDMatrix for more info.
out_dim – See XGBoosterPredictFromDMatrix for more info.
out_result – See XGBoosterPredictFromDMatrix for more info.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterPredictFromCudaColumnar(BoosterHandle handle, char const *values, char const *config, DMatrixHandle m, bst_ulong const **out_shape, bst_ulong *out_dim, const float **out_result)
Inplace prediction from CUDA dense dataframe (cuDF in Python).
Note
If the booster is configured to run on a CPU, XGBoost falls back to run prediction with DMatrix with a performance warning.
- Parameters:
handle – Booster handle
values – List of cuda_array_interface for all columns encoded in JSON list.
config – See XGBoosterPredictFromDMatrix for more info. Additional fields for inplace prediction are:
“missing”: float
m – An optional (NULL if not available) proxy DMatrix instance storing meta info.
out_shape – See XGBoosterPredictFromDMatrix for more info.
out_dim – See XGBoosterPredictFromDMatrix for more info.
out_result – See XGBoosterPredictFromDMatrix for more info.
- Returns:
0 when success, -1 when failure happens
Serialization
- group Serialization
There are multiple ways to serialize a Booster object depending on the use case.
A short note on the serialization APIs: there are 3 different sets of serialization API.
Functions with the term “Model” handle saving/loading the XGBoost model, such as trees or linear weights, stripping out parameter configuration such as training algorithms or CUDA device ID. These functions are designed to let users reuse the trained model for different tasks; examples are prediction, training continuation, or model interpretation.
Functions with the term “Config” handle saving/loading configuration. They help users study the internals of XGBoost. Users can also use the load method to specify parameters in a structured way. These functions were introduced in 1.0.0 and are not yet stable.
Functions with the term “Serialization” combine the above two. They are used in situations like check-pointing, or continuing a training task in a distributed environment. In these cases the task must be carried out without any user intervention.
Functions
-
int XGBoosterLoadModel(BoosterHandle handle, const char *fname)
Load model from existing file.
- Parameters:
handle – handle
fname – File URI or file name. The string must be UTF-8 encoded.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSaveModel(BoosterHandle handle, const char *fname)
Save model into existing file.
- Parameters:
handle – handle
fname – File URI or file name. The string must be UTF-8 encoded.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterLoadModelFromBuffer(BoosterHandle handle, const void *buf, bst_ulong len)
Load model from an in-memory buffer.
- Parameters:
handle – handle
buf – pointer to the buffer
len – the length of the buffer
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSaveModelToBuffer(BoosterHandle handle, char const *config, bst_ulong *out_len, char const **out_dptr)
Save the model into raw bytes and return a pointer to the start of the array. The user must copy the result out before the next XGBoost call.
- Parameters:
handle – handle
config – JSON encoded string storing parameters for the function. Following keys are expected in the JSON document:
“format”: str
json: Output booster will be encoded as JSON.
ubj: Output booster will be encoded as Universal binary JSON.
deprecated: Output booster will be encoded as old custom binary format. Do not use this format except for compatibility reasons.
out_len – The argument to hold the output length
out_dptr – The argument to hold the output data pointer
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSerializeToBuffer(BoosterHandle handle, bst_ulong *out_len, const char **out_dptr)
Memory snapshot based serialization method. Saves all state into the buffer.
- Parameters:
handle – handle
out_len – the argument to hold the output length
out_dptr – the argument to hold the output data pointer
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterUnserializeFromBuffer(BoosterHandle handle, const void *buf, bst_ulong len)
Memory snapshot based serialization method. Loads the buffer returned from XGBoosterSerializeToBuffer.
- Parameters:
handle – handle
buf – pointer to the buffer
len – the length of the buffer
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterSaveJsonConfig(BoosterHandle handle, bst_ulong *out_len, char const **out_str)
Save XGBoost’s internal configuration into a JSON document. Support is currently experimental; the function signature may change in the future without notice.
- Parameters:
handle – handle to Booster object.
out_len – length of output string
out_str – A valid pointer to array of characters. The characters array is allocated and managed by XGBoost, while pointer to that array needs to be managed by caller.
- Returns:
0 when success, -1 when failure happens
-
int XGBoosterLoadJsonConfig(BoosterHandle handle, char const *config)
Load XGBoost’s internal configuration from a JSON document. Support is currently experimental; the function signature may change in the future without notice.
- Parameters:
handle – handle to Booster object.
config – string representation of a JSON document.
- Returns:
0 when success, -1 when failure happens
Collective
- group Collective
Experimental support for exposing internal communicator in XGBoost.
The collective communicator in XGBoost evolved from the rabit project of dmlc but has changed significantly since its adoption. It consists of a tracker and a set of workers. The tracker is responsible for bootstrapping the communication group and handling centralized tasks like logging. The workers are actual communicators performing collective tasks like allreduce.
To use the collective implementation, one needs to first create a tracker with corresponding parameters, then get the arguments for workers using XGTrackerWorkerArgs(). The obtained arguments can then be passed to the XGCommunicatorInit() function. A call to XGCommunicatorInit() must be accompanied by an XGCommunicatorFinalize() call for cleanup. Please note that the communicator uses std::thread in C++, which has undefined behavior in a C++ destructor due to the runtime shutdown sequence. It’s preferable to call XGCommunicatorFinalize() before the runtime is shutting down. This requirement is similar to a Python thread or socket, which should not be relied upon in a __del__ function.
Since it’s used as a part of XGBoost, errors will be returned when an XGBoost function is called; for instance, training a booster might return a connection error.
Note
This is still under development.
Typedefs
-
typedef void *TrackerHandle
Handle to the tracker.
There are currently two types of tracker in XGBoost: the first one is rabit, while the other one is federated. rabit is used for normal collective communication, while federated is used for federated learning.
Functions
-
int XGTrackerCreate(char const *config, TrackerHandle *handle)
Create a new tracker.
dmlc_communicator: String, the type of tracker to create. Available options are rabit and federated. See TrackerHandle for more info.
n_workers: Integer, the number of workers.
port: (Optional) Integer, the port this tracker should listen to.
timeout: (Optional) Integer, timeout in seconds for various networking operations. Default is 300 seconds.
Some configurations are rabit specific:
host: (Optional) String, used by the rabit tracker to specify the address of the host. This can be useful when the communicator cannot reliably obtain the host address.
sortby: (Optional) Integer.
0: Sort workers by their host name.
1: Sort workers by task IDs.
Some federated specific configurations:
federated_secure: Boolean, whether this is a secure server. False for testing.
server_key_path: Path to the server key. Used only if this is a secure server.
server_cert_path: Path to the server certificate. Used only if this is a secure server.
client_cert_path: Path to the client certificate. Used only if this is a secure server.
- Parameters:
config – JSON encoded parameters.
handle – The handle to the created tracker.
- Returns:
0 for success, -1 for failure.
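A minimal sketch of assembling the JSON config for a rabit tracker follows. `make_rabit_tracker_config` is a hypothetical helper, not part of XGBoost; it emits only the `dmlc_communicator`, `n_workers`, and `timeout` keys described above and leaves the optional `port`, `host`, and `sortby` keys at their defaults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of XGBoost): format a tracker config for the
 * rabit tracker. Returns the number of characters written, or -1 if the
 * arguments are invalid or `buf` is too small. */
int make_rabit_tracker_config(char *buf, size_t cap, int n_workers,
                              int timeout_sec) {
  assert(buf != NULL);
  if (n_workers <= 0 || timeout_sec <= 0) return -1;
  int n = snprintf(buf, cap,
                   "{\"dmlc_communicator\": \"rabit\", \"n_workers\": %d, "
                   "\"timeout\": %d}",
                   n_workers, timeout_sec);
  return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The resulting string is what would be passed as the `config` argument when creating the tracker.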
-
int XGTrackerWorkerArgs(TrackerHandle handle, char const **args)
Get the arguments needed for running workers. This should be called after XGTrackerRun().
- Parameters:
handle – The handle to the tracker.
args – The arguments returned as a JSON document.
- Returns:
0 for success, -1 for failure.
-
int XGTrackerRun(TrackerHandle handle, char const *config)
Start the tracker. The tracker runs in the background and this function returns once the tracker is started.
- Parameters:
handle – The handle to the tracker.
config – Unused at the moment, preserved for the future.
- Returns:
0 for success, -1 for failure.
-
int XGTrackerWaitFor(TrackerHandle handle, char const *config)
Wait for the tracker to finish. This should be called after XGTrackerRun(). This function will block until the tracker task is finished or the timeout is reached.
- Parameters:
handle – The handle to the tracker.
config – JSON encoded configuration. No argument is required yet, preserved for the future.
- Returns:
0 for success, -1 for failure.
-
int XGTrackerFree(TrackerHandle handle)
Free a tracker instance. This should be called after XGTrackerWaitFor(). If the tracker is not properly waited for, this function will shut down all connections with the tracker, potentially leading to undefined behavior.
- Parameters:
handle – The handle to the tracker.
- Returns:
0 for success, -1 for failure.
-
int XGCommunicatorInit(char const *config)
Initialize the collective communicator.
The communicator API is currently experimental; function signatures may change in the future without notice.
Call this once in the worker process before using anything. Please make sure XGCommunicatorFinalize() is called after use. The initialized communicator is a global thread-local variable.
Only applicable to the rabit communicator:
dmlc_tracker_uri: Hostname or IP address of the tracker.
dmlc_tracker_port: Port number of the tracker.
dmlc_task_id: ID of the current task, can be used to obtain deterministic rank assignment.
dmlc_retry: The number of retries for connection failure.
dmlc_timeout: Timeout in seconds.
dmlc_nccl_path: Path to the nccl shared library libnccl.so.
Only applicable to the federated communicator (use upper case for environment variables, use lower case for runtime configuration):
federated_server_address: Address of the federated server.
federated_world_size: Number of federated workers.
federated_rank: Rank of the current worker.
federated_server_cert_path: Server certificate file path. Only needed for the SSL mode.
federated_client_key_path: Client key file path. Only needed for the SSL mode.
federated_client_cert_path: Client certificate file path. Only needed for the SSL mode.
- Parameters:
config – JSON encoded configuration. Accepted JSON keys are:
dmlc_communicator: The type of the communicator, this should match the tracker type.
rabit: Use Rabit. This is the default if the type is unspecified.
federated: Use the gRPC interface for Federated Learning.
- Returns:
0 for success, -1 for failure.
-
int XGCommunicatorFinalize(void)
Finalize the collective communicator.
Call this function after you have finished all jobs.
- Returns:
0 for success, -1 for failure.
-
int XGCommunicatorGetRank(void)
Get rank of the current process.
- Returns:
Rank of the worker.
-
int XGCommunicatorGetWorldSize(void)
Get the total number of processes.
- Returns:
Total world size.
-
int XGCommunicatorIsDistributed(void)
Check whether the communicator is distributed.
- Returns:
True if the communicator is distributed.
-
int XGCommunicatorPrint(char const *message)
Print the message to the tracker.
This function can be used to communicate the information of the progress to the user who monitors the tracker.
- Parameters:
message – The message to be printed.
- Returns:
0 for success, -1 for failure.
-
int XGCommunicatorGetProcessorName(const char **name_str)
Get the name of the processor.
- Parameters:
name_str – Pointer that receives the returned processor name.
- Returns:
0 for success, -1 for failure.
-
int XGCommunicatorBroadcast(void *send_receive_buffer, size_t size, int root)
Broadcast a memory region to all others from root. This function is NOT thread-safe.
Example:
int a = 1; Broadcast(&a, sizeof(a), root);
- Parameters:
send_receive_buffer – Pointer to the send or receive buffer.
size – Size of the data in bytes.
root – The process rank to broadcast from.
- Returns:
0 for success, -1 for failure.
-
int XGCommunicatorAllreduce(void *send_receive_buffer, size_t count, int data_type, int op)
Perform in-place allreduce. This function is NOT thread-safe.
Example usage: the following code computes the element-wise sum across workers.
enum class Op { kMax = 0, kMin = 1, kSum = 2, kBitwiseAND = 3, kBitwiseOR = 4, kBitwiseXOR = 5 };
std::vector<int> data(10);
...
Allreduce(data.data(), data.size(), DataType::kInt32, Op::kSum);
...
- Parameters:
send_receive_buffer – Buffer for both sending and receiving data.
count – Number of elements to be reduced.
data_type – Enumeration of data type, see xgboost::collective::DataType in communicator.h.
op – Enumeration of operation type, see xgboost::collective::Operation in communicator.h.
- Returns:
0 for success, -1 for failure.
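To make the in-place semantics concrete, the sketch below models allreduce inside a single process: each "worker" is just an array, and after the call every array holds the element-wise reduction. This is a local illustration only and involves no XGBoost calls; the function names are hypothetical, while the operation codes mirror the values in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operation codes with the same numeric values as the example above;
 * the identifiers themselves are illustrative. */
enum { OP_MAX = 0, OP_MIN = 1, OP_SUM = 2,
       OP_BITWISE_AND = 3, OP_BITWISE_OR = 4, OP_BITWISE_XOR = 5 };

/* Combine two values with the given operation. */
static int32_t reduce_pair(int32_t a, int32_t b, int op) {
  switch (op) {
    case OP_MAX: return a > b ? a : b;
    case OP_MIN: return a < b ? a : b;
    case OP_SUM: return a + b; /* caller must avoid signed overflow */
    case OP_BITWISE_AND: return a & b;
    case OP_BITWISE_OR: return a | b;
    case OP_BITWISE_XOR: return a ^ b;
    default: assert(!"unknown op"); return a;
  }
}

/* Single-process model of an in-place allreduce over int32 buffers:
 * bufs[w] is worker w's buffer of `count` elements. Afterwards every
 * buffer holds the element-wise reduction across all workers. */
void simulate_allreduce_i32(int32_t **bufs, size_t n_workers, size_t count,
                            int op) {
  assert(bufs != NULL && n_workers > 0);
  for (size_t i = 0; i < count; ++i) {
    int32_t acc = bufs[0][i];
    for (size_t w = 1; w < n_workers; ++w) {
      acc = reduce_pair(acc, bufs[w][i], op);
    }
    for (size_t w = 0; w < n_workers; ++w) bufs[w][i] = acc;
  }
}
```

With three workers holding {1, 5}, {2, 3}, and {4, 0}, a sum reduction leaves {7, 8} in every buffer, which is the same result each worker would observe in its own send_receive_buffer.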