# Feature importance

Creates a `data.table` of feature importances.

## Usage

``` r
xgb.importance(
  model = NULL,
  feature_names = getinfo(model, "feature_name"),
  trees = NULL
)
```

## Arguments

- model: Object of class `xgb.Booster`.

- feature_names: Character vector used to overwrite the feature names of the model. The default is `NULL` (use original feature names).

- trees: An integer vector of (base-1) tree indices that should be included in the importance calculation (only for the "gbtree" booster). The default (`NULL`) parses all trees. Restricting the trees can be useful, e.g., in multiclass classification to get feature importances for each class separately.

## Value

A `data.table` with the following columns:

For a tree model:

- `Features`: Names of the features used in the model.
- `Gain`: Fractional contribution of each feature to the model based on the total gain of this feature's splits. A higher percentage means higher importance.
- `Cover`: Metric of the number of observations related to this feature.
- `Frequency`: Percentage of times a feature has been used in trees.

For a linear model:

- `Features`: Names of the features used in the model.
- `Weight`: Linear coefficient of this feature.
- `Class`: Class label (only for multiclass models). For objects of class `xgboost` (as produced by [`xgboost()`](https://github.com/dmlc/xgboost/reference/xgboost.md)), it will be a `factor`, while for objects of class `xgb.Booster` (as produced by [`xgb.train()`](https://github.com/dmlc/xgboost/reference/xgb.train.md)), it will be a zero-based integer vector.

If `feature_names` is not provided and `model` doesn't have `feature_names`, the index of the features will be used instead. Because the index is extracted from the model dump (based on C++ code), it starts at 0 (as in C/C++ or Python) instead of 1 (usual in R).

## Details

This function works for both linear and tree models.

For linear models, the importance is the absolute magnitude of linear coefficients.
To obtain a meaningful ranking by importance for linear models, the features need to be on the same scale (which is also recommended when using L1 or L2 regularization).

## Examples

``` r
# binary classification using "gbtree":
data("ToothGrowth")
x <- ToothGrowth[, c("len", "dose")]
y <- ToothGrowth$supp
model_tree_binary <- xgboost(
  x, y,
  nrounds = 5L,
  nthreads = 1L,
  booster = "gbtree",
  max_depth = 2L
)
xgb.importance(model_tree_binary)

# binary classification using "gblinear":
model_tree_linear <- xgboost(
  x, y,
  nrounds = 5L,
  nthreads = 1L,
  booster = "gblinear",
  learning_rate = 0.3
)
xgb.importance(model_tree_linear)

# multi-class classification using "gbtree":
data("iris")
x <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
y <- iris$Species
model_tree_multi <- xgboost(
  x, y,
  nrounds = 5L,
  nthreads = 1L,
  booster = "gbtree",
  max_depth = 3
)
# all classes clumped together:
xgb.importance(model_tree_multi)
# inspect importances separately for each class:
num_classes <- 3L
nrounds <- 5L
xgb.importance(
  model_tree_multi, trees = seq(from = 1, by = num_classes, length.out = nrounds)
)
xgb.importance(
  model_tree_multi, trees = seq(from = 2, by = num_classes, length.out = nrounds)
)
xgb.importance(
  model_tree_multi, trees = seq(from = 3, by = num_classes, length.out = nrounds)
)

# multi-class classification using "gblinear":
model_linear_multi <- xgboost(
  x, y,
  nrounds = 5L,
  nthreads = 1L,
  booster = "gblinear",
  learning_rate = 0.2
)
xgb.importance(model_linear_multi)
```
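The per-class `seq()` calls in the multiclass example can be wrapped in a small helper. The following is a minimal sketch in base R, assuming (as those calls imply) that the model stores one tree per class in every boosting round, in class order. The name `class_tree_indices` is hypothetical and not part of xgboost.

``` r
# Hypothetical helper (not part of xgboost): returns the base-1 indices of
# the trees that belong to one class, assuming one tree per class is added
# in every boosting round and trees are stored round by round.
class_tree_indices <- function(class_id, num_classes, nrounds) {
  stopifnot(
    class_id >= 1L,
    class_id <= num_classes,
    nrounds >= 1L
  )
  # Class k owns trees k, k + num_classes, k + 2 * num_classes, ...
  seq(from = class_id, by = num_classes, length.out = nrounds)
}

# Tree indices for each of 3 classes after 5 boosting rounds:
per_class <- lapply(
  1:3,
  class_tree_indices,
  num_classes = 3L,
  nrounds = 5L
)
per_class
```

Each element of `per_class` could then be passed as the `trees` argument of `xgb.importance()` to get the importances for the corresponding class.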