This CallbackTuning integrates early stopping into the hyperparameter tuning of an XGBoost learner.
Early stopping estimates the optimal number of trees (nrounds) for a given hyperparameter configuration.
Since early stopping is performed in each resampling iteration, there are several optimal nrounds values.
The callback writes the maximum value to the archive in the max_nrounds column.
In the best hyperparameter configuration (instance$result_learner_param_vals), the value of nrounds is replaced by max_nrounds and early stopping is deactivated.
Details
Currently, the callback does not work with GraphLearners from the package mlr3pipelines.
The callback is compatible with the AutoTuner.
The final model is fitted with the best hyperparameter configuration and max_nrounds, i.e. early stopping is not performed.
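For example, a minimal sketch of combining the callback with an AutoTuner; the learner setup mirrors the Examples below, and the auto_tuner() argument names are an assumption based on a recent mlr3tuning release:

library(mlr3tuning)
library(mlr3learners)

# assumed learner setup, same as in the Examples below
learner = lrn("classif.xgboost",
  eta = to_tune(1e-02, 1e-1, logscale = TRUE),
  early_stopping_rounds = 5,
  nrounds = 100,
  early_stopping_set = "test")

at = auto_tuner(
  tuner = tnr("random_search"),
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  term_evals = 10,
  callbacks = clbk("mlr3tuning.early_stopping")
)

# at$train(tsk("pima")) would then fit the final model with the best
# configuration and max_nrounds, i.e. without early stopping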
Resources
gallery post on early stopping with XGBoost.
Examples
clbk("mlr3tuning.early_stopping")
#> <CallbackTuning:mlr3tuning.early_stopping>: Early Stopping Callback
#> * Active Stages: on_eval_before_archive, on_eval_after_benchmark,
#> on_result, on_optimization_begin
# \donttest{
if (requireNamespace("mlr3learners") && requireNamespace("xgboost")) {
  library(mlr3learners)

  # activate early stopping on the test set and set the search space
  learner = lrn("classif.xgboost",
    eta = to_tune(1e-02, 1e-1, logscale = TRUE),
    early_stopping_rounds = 5,
    nrounds = 100,
    early_stopping_set = "test")

  # tune xgboost on the pima data set
  instance = tune(
    tuner = tnr("random_search"),
    task = tsk("pima"),
    learner = learner,
    resampling = rsmp("cv", folds = 3),
    measures = msr("classif.ce"),
    term_evals = 10,
    callbacks = clbk("mlr3tuning.early_stopping")
  )
}
#> Loading required namespace: mlr3learners
#> Loading required namespace: xgboost
# }
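Continuing the example, the estimated max_nrounds values and the final nrounds can be inspected; a minimal sketch, assuming the instance object created above and the archive column described in the Description:

library(data.table)

# per-configuration results, including the max_nrounds column written by the callback
as.data.table(instance$archive)[, list(eta, classif.ce, max_nrounds)]

# nrounds in the tuning result is replaced by max_nrounds of the best configuration
instance$result_learner_param_vals$nrounds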