An mlr3::LearnerClassif that fits a Guided Regularized Random Forest (GRRF) for use in mlr3filters::FilterImportance-based feature selection.
The learner implements a two-stage approach:
An initial unregularized random forest estimates variable importance scores.
Those scores are normalized and used to compute per-feature split-selection weights (coefReg), penalizing less informative predictors.
A second "guided regularized" forest is trained with those weights, and its impurity importance scores are exposed via $importance().
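The weighting step can be sketched in base R. This is only an illustration: it assumes the formula from Deng and Runger (2013) with a base coefficient of 1, coefReg = (1 - gamma) + gamma * imp / max(imp), and the importance values and feature names are made up. The learner's internal normalization may differ.

```r
# Hypothetical importance scores from the first, unregularized forest.
imp <- c(feat_a = 4.0, feat_b = 1.0, feat_c = 0.0)

# Guidance coefficient (same meaning as the learner's gamma parameter).
gamma <- 0.5

# Normalize importance to [0, 1] by dividing by the maximum score.
imp_norm <- imp / max(imp)

# Assumed GRRF weighting: the most important feature gets weight 1
# (no penalty); less important features get weights closer to 1 - gamma.
coef_reg <- (1 - gamma) + gamma * imp_norm

print(coef_reg)
```

Under this assumption, gamma = 0 gives every feature the same weight, while gamma = 1 makes the weight equal to the normalized importance.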
Class-balanced sample weights are computed automatically from the target column, so no external weight vector is required.
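The exact weighting scheme is not documented here. As a rough sketch, one common way to derive class-balanced weights from a target column is inverse class frequency, n / (k * n_class), which gives every class the same total weight. The target vector below is hypothetical.

```r
# Hypothetical imbalanced target column with two classes.
y <- factor(c("M", "M", "M", "R"))

# Number of observations per class.
class_counts <- table(y)

# Inverse-frequency weights (an assumed scheme, not necessarily the
# learner's): n / (number of classes * size of the observation's class).
w <- as.numeric(length(y) / (nlevels(y) * class_counts[as.character(y)]))

# Each class now contributes the same total weight.
print(tapply(w, y, sum))
```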
Parameters
gamma (numeric [0, 1], default 0.5): Guidance coefficient. 0 = unguided regularized forest (equal penalty for all features); 1 = strongest guiding effect (most important features penalized least).
num.trees (integer >= 1, default 500): Number of trees in each forest.
max.depth (integer >= 0, default 100): Maximum tree depth (0 = unlimited).
References
Deng, H., & Runger, G. (2013). Gene selection with guided regularized random forest. Pattern Recognition, 46(12), 3483-3489. https://doi.org/10.1016/j.patcog.2013.05.018
Wundervald, B. et al. (2020). Generalizing Gain Penalization for Feature Selection in Tree-Based Models. IEEE Access, 8, 190231-190239. https://doi.org/10.1109/ACCESS.2020.3032095
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifGrrf
Methods
Inherited methods
mlr3::Learner$base_learner()
mlr3::Learner$configure()
mlr3::Learner$encapsulate()
mlr3::Learner$format()
mlr3::Learner$help()
mlr3::Learner$predict()
mlr3::Learner$predict_newdata()
mlr3::Learner$print()
mlr3::Learner$reset()
mlr3::Learner$selected_features()
mlr3::Learner$train()
mlr3::LearnerClassif$predict_newdata_fast()
LearnerClassifGrrf$new()
Initialise the learner with its parameter set.
Usage
LearnerClassifGrrf$new()
Examples
if (FALSE) { # \dontrun{
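library(mlr3)
library(mlr3filters)
# Create the learner and set the guidance coefficient and forest size.
learner <- LearnerClassifGrrf$new()
learner$param_set$values <- list(gamma = 0.9, num.trees = 50L)
# Wrap the learner in an importance filter and score the sonar features.
filter <- mlr3filters::FilterImportance$new(learner = learner)
task <- mlr3::tsk("sonar")
filter$calculate(task)
# Show the features ranked by their importance scores.
as.data.table(filter)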
} # }