Computes Friedman's H-statistic to assess the strength of variable interactions.
Usage
interact(gbm_fit_obj, data, var_indices=1, num_trees=gbm_fit_obj$params$num_trees)
# S3 method for class 'GBMFit'
interact(
  gbm_fit_obj,
  data,
  var_indices = 1,
  num_trees = gbm_fit_obj$params$num_trees
)
Arguments
- gbm_fit_obj
a GBMFit object fitted using a call to gbmt.
- data
the dataset used to construct gbm_fit_obj. If the original dataset is large, a random subsample may be used to accelerate the computation.
- var_indices
a vector of indices or the names of the variables for which to compute the interaction effect. If using indices, the variables are indexed in the same order that they appear in the initial gbmt formula.
- num_trees
the number of trees used to compute the statistic. Only the first num_trees trees will be used.
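The arguments above might be combined as in the following sketch. It is illustrative only and is not taken from the package documentation: the simulated data frame sim, the tuning values, and the assumption that gbmt(), gbm_dist() and training_params() accept the arguments shown are all ours. The model is fitted only if gbm3 is installed.

```r
# Simulated data with a deliberate x1:x2 interaction (illustrative only).
set.seed(1)
n <- 500
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- sim$x1 + sim$x2 + 2 * sim$x1 * sim$x2 + rnorm(n, sd = 0.1)

if (requireNamespace("gbm3", quietly = TRUE)) {
  library(gbm3)

  # Assumed fitting call; the tuning values are arbitrary.
  fit <- gbmt(
    y ~ x1 + x2 + x3,
    data = sim,
    distribution = gbm_dist("Gaussian"),
    train_params = training_params(
      num_trees = 500,
      interaction_depth = 3,
      shrinkage = 0.05,
      num_train = n,
      num_features = 3
    ),
    is_verbose = FALSE
  )

  # Variables selected by name, using every fitted tree (the default).
  h_12 <- interact(fit, data = sim, var_indices = c("x1", "x2"))

  # Variables selected by formula position, using only the first 250 trees.
  h_13 <- interact(fit, data = sim, var_indices = c(1, 3), num_trees = 250)

  print(c(x1_x2 = h_12, x1_x3 = h_13))
}
```

Since only x1 and x2 interact in the simulated response, the first value would be expected to be clearly larger than the second, although the exact numbers depend on the fit.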
Details
interact.GBMFit computes Friedman's H-statistic to assess the relative
strength of interaction effects in non-linear models. H is on the scale of
[0, 1], with higher values indicating larger interaction effects. To connect
to a more familiar measure, if \(x_1\) and \(x_2\) are uncorrelated
covariates with mean 0 and variance 1 and the model is of the form
$$y=\beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_1x_2$$ then
$$H=\frac{|\beta_3|}{\sqrt{\beta_1^2+\beta_2^2+\beta_3^2}}$$
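The expression for H can be checked by hand. The sketch below assumes the interaction model \(F(x_1,x_2)=\beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_1x_2\), takes \(x_1\) and \(x_2\) to be independent (slightly stronger than uncorrelated) with mean 0 and variance 1, and uses the two-variable statistic from Friedman and Popescu with centered partial dependence functions \(F_1\), \(F_2\) and \(F_{12}\):
$$H^2=\frac{\sum_i\left[F_{12}(x_{i1},x_{i2})-F_1(x_{i1})-F_2(x_{i2})\right]^2}{\sum_i F_{12}^2(x_{i1},x_{i2})}$$
Averaging over the other variable and centering gives
$$F_1(x_1)=\beta_1x_1,\qquad F_2(x_2)=\beta_2x_2,\qquad F_{12}(x_1,x_2)=\beta_1x_1+\beta_2x_2+\beta_3x_1x_2$$
so the numerator retains only the interaction term. Replacing sums by expectations, the cross terms vanish under independence and
$$H^2=\frac{E\left[\beta_3^2x_1^2x_2^2\right]}{E\left[(\beta_1x_1+\beta_2x_2+\beta_3x_1x_2)^2\right]}=\frac{\beta_3^2}{\beta_1^2+\beta_2^2+\beta_3^2}$$
Taking the non-negative square root gives \(H=|\beta_3|/\sqrt{\beta_1^2+\beta_2^2+\beta_3^2}\).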
Note that if the main effects are weak, the estimated H will be unstable. For example, if (in the case of a two-way interaction) neither main effect is in the selected model (relative influence is zero), the result will be 0/0. Also, with weak main effects, rounding errors can result in values of H > 1 which are not possible.
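Both the closed form and the 0/0 case can be reproduced with a small base R sketch. This is not how interact.GBMFit is implemented: friedman_h is a hypothetical helper that applies the H-statistic definition directly to a known function of two variables, with partial dependence estimated by averaging over the sample, and the coefficient values are arbitrary.

```r
# Empirical H-statistic for a known function f(x1, x2) of two variables only.
# Hypothetical helper for illustration; not part of gbm3.
friedman_h <- function(f, x1, x2) {
  # Partial dependence on one variable: average f over the other one's sample.
  pd1 <- vapply(x1, function(a) mean(f(a, x2)), numeric(1))
  pd2 <- vapply(x2, function(b) mean(f(x1, b)), numeric(1))
  # With only two inputs, the joint partial dependence is f itself.
  pd12 <- f(x1, x2)

  # Center each partial dependence function.
  pd1 <- pd1 - mean(pd1)
  pd2 <- pd2 - mean(pd2)
  pd12 <- pd12 - mean(pd12)

  sqrt(sum((pd12 - pd1 - pd2)^2) / sum(pd12^2))
}

set.seed(42)
n <- 5000
x1 <- as.numeric(scale(rnorm(n)))  # sample mean 0, sample variance 1
x2 <- as.numeric(scale(rnorm(n)))

# Model with main effects and an x1 * x2 interaction (arbitrary coefficients).
beta <- c(0.5, 1, 2, 2)
f_int <- function(a, b) beta[1] + beta[2] * a + beta[3] * b + beta[4] * a * b

h_empirical <- friedman_h(f_int, x1, x2)
h_formula <- abs(beta[4]) / sqrt(sum(beta[2:4]^2))
print(c(empirical = h_empirical, formula = h_formula))

# A function that ignores both variables: numerator and denominator are both
# zero, so the statistic is 0/0 and R returns NaN.
f_flat <- function(a, b) 0 * a * b + 1
h_flat <- friedman_h(f_flat, x1, x2)
print(h_flat)
```

The empirical and closed-form values should agree up to sampling error. A caller that wants to guard against the degenerate case could test the result with is.nan() before interpreting it.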
References
J.H. Friedman and B.E. Popescu (2005). “Predictive Learning via Rule Ensembles.” Section 8.1
Author
Greg Ridgeway gregridgeway@gmail.com