Visualizes distributions related to the depth of tree leaves.
xgb.plot.deepness() uses base R graphics, while xgb.ggplot.deepness() uses "ggplot2".
Arguments
- model
Either an xgb.Booster model, or the "data.table" returned by xgb.model.dt.tree().
- which
Which distribution to plot (see Details).
- plot
Should the plot be shown? Default is TRUE.
- ...
Other parameters passed to graphics::barplot() or graphics::plot().
Value
The return value of the two functions is as follows:
- xgb.plot.deepness(): A "data.table" (invisibly). Each row corresponds to a terminal leaf in the model and contains its depth, cover, and weight (the value used in calculating predictions). If plot = TRUE, a plot is also shown.
- xgb.ggplot.deepness(): When which = "2x1", a list of two "ggplot" objects; otherwise, a single "ggplot" object.
Details
When which = "2x1", two distributions with respect to the leaf depth are plotted on top of each other:
- The distribution of the number of leaves in a tree model at a certain depth.
- The distribution of the average weighted number of observations ("cover") ending up in leaves at a certain depth.
These can help in determining sensible ranges for the max_depth and min_child_weight parameters.
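The two per-depth summaries can be illustrated with a small base R sketch. The leaf table below is hypothetical toy data, and the column names Tree, Depth, and Cover are assumptions made for illustration; this is not the package's internal code, only the kind of aggregation the "2x1" plot displays.

```r
# Hypothetical leaf table: one row per terminal leaf.
# Column names (Tree, Depth, Cover) are assumed for illustration.
leaves <- data.frame(
  Tree  = c(0, 0, 0, 1, 1, 1, 1),
  Depth = c(1, 2, 2, 2, 2, 3, 3),
  Cover = c(40, 25, 15, 30, 20, 6, 4)
)

# Number of leaves found at each depth (first distribution)
n_leaves <- table(leaves$Depth)

# Average cover of the leaves at each depth (second distribution)
mean_cover <- tapply(leaves$Cover, leaves$Depth, mean)

depth_summary <- data.frame(
  Depth = as.integer(names(n_leaves)),
  N     = as.integer(n_leaves),
  Cover = as.numeric(mean_cover)
)
print(depth_summary)
```

If most leaves sit at the deepest level, or deep leaves have very small cover, that may suggest revisiting max_depth or min_child_weight, respectively.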
When which = "max.depth" or which = "med.depth", the maximum or median leaf depth of each tree is plotted against the tree number.
Finally, which = "med.weight" shows how a tree's median absolute leaf weight changes over the iterations.
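As a rough illustration of the per-tree statistics described here, the following base R sketch computes the maximum depth, median depth, and median absolute leaf weight for each tree. The data and the column names Tree, Depth, and Weight are hypothetical; the functions themselves derive these values from the fitted model.

```r
# Hypothetical leaf table: one row per terminal leaf.
# Column names (Tree, Depth, Weight) are assumed for illustration.
leaves <- data.frame(
  Tree   = c(0, 0, 0, 1, 1, 1, 1),
  Depth  = c(1, 2, 2, 2, 2, 3, 3),
  Weight = c(0.4, -0.2, 0.1, 0.3, -0.1, 0.05, -0.25)
)

# One row per tree: the quantities plotted against the tree number
per_tree <- data.frame(
  Tree           = sort(unique(leaves$Tree)),
  max_depth      = as.numeric(tapply(leaves$Depth, leaves$Tree, max)),
  med_depth      = as.numeric(tapply(leaves$Depth, leaves$Tree, median)),
  med_abs_weight = as.numeric(tapply(abs(leaves$Weight), leaves$Tree, median))
)
print(per_tree)
```

Plotting any of these columns against Tree gives a picture analogous to the "max.depth", "med.depth", and "med.weight" options.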
These functions have been inspired by the blog post https://github.com/aysent/random-forest-leaf-visualization.
Examples
data(agaricus.train, package = "xgboost")
## Keep the number of threads to 2 for examples
nthread <- 2
data.table::setDTthreads(nthread)
## Change max_depth to a higher number to get a more significant result
model <- xgboost(
  agaricus.train$data, factor(agaricus.train$label),
  nrounds = 50,
  max_depth = 6,
  nthreads = nthread,
  subsample = 0.5,
  min_child_weight = 2
)
xgb.plot.deepness(model)
xgb.ggplot.deepness(model)
xgb.plot.deepness(
  model, which = "max.depth", pch = 16, col = rgb(0, 0, 1, 0.3), cex = 2
)
xgb.plot.deepness(
  model, which = "med.weight", pch = 16, col = rgb(0, 0, 1, 0.3), cex = 2
)