Param for setting the checkpoint interval (>= 1) or disabling checkpointing (-1). E.g. 10 means that
the trained model is checkpointed every 10 iterations. Note: checkpoint_path must
also be set if the checkpoint interval is greater than 0.
The HDFS folder from which checkpoint boosters are loaded and to which they are saved. default: empty string
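The rule linking the checkpoint interval to the checkpoint path can be sketched as a small validation step. This is a minimal illustration, not the library's own check: `CheckpointSettings`, `validate`, and `toParams` are hypothetical names, the key `checkpoint_interval` and the default of -1 are assumptions (only `checkpoint_path` is named in the text above), and the HDFS path in the usage note is a placeholder.

```scala
// Hypothetical container for the two checkpoint settings described above.
// The default interval of -1 (disabled) is an assumption for this sketch.
final case class CheckpointSettings(interval: Int = -1, path: String = "")

object CheckpointSettings {
  // Returns Left(reason) when the settings contradict the documented rules:
  // the interval is either -1 or >= 1, and a path is required when it is >= 1.
  def validate(s: CheckpointSettings): Either[String, CheckpointSettings] =
    if (s.interval != -1 && s.interval < 1)
      Left(s"interval must be -1 (disabled) or >= 1, got ${s.interval}")
    else if (s.interval >= 1 && s.path.isEmpty)
      Left("checkpoint_path must be set when the interval is >= 1")
    else
      Right(s)

  // Builds a parameter map. "checkpoint_interval" is an assumed key name;
  // "checkpoint_path" is the name used in the description above.
  def toParams(s: CheckpointSettings): Map[String, Any] =
    Map("checkpoint_interval" -> s.interval, "checkpoint_path" -> s.path)
}
```

For instance, `CheckpointSettings.validate(CheckpointSettings(10, "hdfs:///tmp/ckpt"))` yields a `Right`, whereas an interval of 10 with an empty path yields a `Left` explaining the missing path.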
Customized evaluation function provided by the user. default: null
Customized objective function provided by the user. default: null
The value treated as missing. default: Float.NaN
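Because the default missing value is Float.NaN, and NaN never compares equal to anything (including itself), a check for "missing" cannot rely on `==` alone. The sketch below shows one way to express that check; `MissingSketch` and `isMissing` are hypothetical names used for illustration and are not part of the library.

```scala
object MissingSketch {
  // Hypothetical helper: decides whether `value` counts as missing.
  // NaN needs isNaN, since `Float.NaN == Float.NaN` is false.
  def isMissing(value: Float, missing: Float = Float.NaN): Boolean =
    if (missing.isNaN) value.isNaN else value == missing
}
```

With the default, only NaN entries count as missing; with `missing = 0.0f`, zeros count as missing and NaN entries do not.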
Number of workers used to train the XGBoost model. default: 1
Number of threads used by each worker. default: 1
The number of boosting rounds.
Random seed for the C++ part of XGBoost and train/test splitting.
0 prints running messages; 1 enables silent mode. default: 0
The maximum time to wait for the job that requests new workers. default: 30 minutes
Rabit tracker configurations. The parameter must be provided as an instance of the TrackerConf class, which has the following definition:
case class TrackerConf(workerConnectionTimeout: Duration, trainingTimeout: Duration, trackerImpl: String)
See below for detailed explanations.
Tracker implementation (trackerImpl): either "python" or "scala". The "python" version uses the Java wrapper of the Python Rabit tracker (in dmlc_core) and does not support timeout settings. The "scala" version has no Python components and fully supports timeout settings.
The timeout value should take the time of data loading and pre-processing into account, because Spark executes its operations lazily. Alternatively, you may force Spark to perform the data transformation before calling XGBoost.train(), so that this timeout truly reflects the connection delay. Set a reasonable timeout value to prevent model training/testing from hanging indefinitely, possibly due to network issues. Note that a zero timeout value means waiting indefinitely (equivalent to Duration.Inf). Ignored if the tracker implementation is "python".
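As a rough illustration of the tracker configuration described above, the sketch below builds a TrackerConf value and applies the "zero means wait indefinitely" reading of a timeout. The case class is a local stand-in copied from the definition quoted above so that the snippet compiles on its own. `TrackerConfSketch` and `effectiveTimeout` are hypothetical names, and the timeout values are arbitrary examples rather than recommendations.

```scala
import scala.concurrent.duration._

// Local stand-in mirroring the definition quoted above, so that this sketch
// compiles without XGBoost on the classpath.
case class TrackerConf(
    workerConnectionTimeout: Duration,
    trainingTimeout: Duration,
    trackerImpl: String)

object TrackerConfSketch {
  // Illustrative values only. The "scala" tracker is chosen because the
  // "python" tracker does not support timeout settings.
  val conf: TrackerConf = TrackerConf(
    workerConnectionTimeout = 2.minutes,
    trainingTimeout = Duration.Zero,
    trackerImpl = "scala")

  // Hypothetical helper: a zero timeout is read as "wait indefinitely",
  // which the description above equates with Duration.Inf.
  def effectiveTimeout(timeout: Duration): Duration = timeout match {
    case f: FiniteDuration if f.length == 0L => Duration.Inf
    case other => other
  }
}
```

Here `effectiveTimeout(conf.trainingTimeout)` evaluates to `Duration.Inf`, while the two-minute connection timeout is returned unchanged.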
Whether to use external memory as cache. default: false