Feature Selection - Regularized Trees

Regularized Trees

The features used by a decision tree or a tree ensemble have been shown to be redundant. A recent method, called regularized trees, can be used for feature subset selection. Regularized trees penalize the use of a variable that is similar to the variables selected at previous tree nodes when splitting the current node. Regularized trees need to build only one tree model (or one tree-ensemble model) and are therefore computationally efficient.
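The penalized split rule above can be sketched as follows. This is a minimal illustration, not the original implementation: the function names, the toy gain values, and the single coefficient `lam` are all assumptions; the idea is that a feature already in the selected subset F keeps its full impurity gain, while a new feature's gain is shrunk by a factor in (0, 1].

```python
def regularized_gain(feature, raw_gain, selected, lam=0.7):
    # Features already in the selected subset F keep their full gain;
    # new features are penalized by the coefficient lam in (0, 1].
    return raw_gain if feature in selected else lam * raw_gain

def choose_split(gains, selected, lam=0.7):
    # Pick the feature maximizing the regularized gain at this node,
    # and add it to F if it is new.
    best = max(gains, key=lambda f: regularized_gain(f, gains[f], selected, lam))
    selected.add(best)
    return best

# Toy gains at one node: "a" is already selected; "b" has a slightly
# higher raw gain but would enlarge the selected subset.
gains = {"a": 0.30, "b": 0.35}
F = {"a"}
print(choose_split(gains, F, lam=0.7))  # "a" wins: 0.30 > 0.7 * 0.35
```

Because a new feature must beat an already-selected one by a margin of at least 1/lam, the tree tends to reuse the same small subset of variables, which is exactly what makes the final set of split variables a compact feature subset.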

Regularized trees naturally handle numerical and categorical features, interactions, and nonlinearities. They are invariant to attribute scales (units) and insensitive to outliers, and thus require little data preprocessing such as normalization. The regularized random forest (RRF) is one type of regularized trees. Guided RRF is an enhanced RRF in which the penalty is guided by the importance scores from an ordinary random forest.
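The guided variant can be sketched by turning the single penalty coefficient into a per-feature one. This follows a commonly cited form of guided RRF, lam_i = (1 - gamma) * lam0 + gamma * imp_i with importances normalized to [0, 1]; the function name, parameter names, and the mixing scheme here are assumptions for illustration, and the published method may differ in detail.

```python
def guided_coefficients(importance, lam0=0.7, gamma=0.5):
    # Per-feature penalty coefficients for a guided RRF sketch.
    # importance: raw importance scores from an ordinary random forest.
    # gamma in [0, 1] controls how strongly importances steer the penalty:
    # gamma = 0 recovers a plain RRF with a uniform coefficient lam0.
    top = max(importance.values())
    return {
        f: (1 - gamma) * lam0 + gamma * (score / top)
        for f, score in importance.items()
    }

# Toy importances: "a" looks twice as important as "b" to the base forest,
# so "a" receives a weaker penalty (coefficient closer to 1).
coefs = guided_coefficients({"a": 1.0, "b": 0.5}, lam0=0.8, gamma=0.5)
```

Features that the base forest already rates highly get coefficients near 1 (little penalty), so the regularized forest is nudged toward them, while unimportant features are penalized more heavily.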

