Feature Selection - Regularized Trees

Regularized Trees

The features used by a decision tree or a tree ensemble have been shown to be redundant. A recent method, called regularized trees, can be used for feature subset selection. Regularized trees penalize the use of a variable that is similar to the variables selected at previous tree nodes when splitting the current node. Regularized trees need to build only one tree model (or one tree-ensemble model) and are therefore computationally efficient.
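The penalized split rule above can be sketched as follows. This is a minimal illustration, not the original implementation: the function names, the toy gain values, and the single coefficient `lam` are all assumptions; the idea is that a feature already in the selected subset F keeps its full impurity gain, while a new feature's gain is shrunk by a factor in (0, 1].

```python
def regularized_gain(feature, raw_gain, selected, lam=0.7):
    # Features already in the selected subset F keep their full gain;
    # new features are penalized by the coefficient lam in (0, 1].
    return raw_gain if feature in selected else lam * raw_gain

def choose_split(gains, selected, lam=0.7):
    # Pick the feature maximizing the regularized gain at this node,
    # and add it to F if it is new.
    best = max(gains, key=lambda f: regularized_gain(f, gains[f], selected, lam))
    selected.add(best)
    return best

# Toy gains at one node: "a" is already selected; "b" has a slightly
# higher raw gain but would enlarge the selected subset.
gains = {"a": 0.30, "b": 0.35}
F = {"a"}
print(choose_split(gains, F, lam=0.7))  # "a" wins: 0.30 > 0.7 * 0.35
```

Because a new feature must beat an already-selected one by a margin of at least 1/lam, the tree tends to reuse the same small subset of variables, which is exactly what makes the final set of split variables a compact feature subset.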

Regularized trees naturally handle numerical and categorical features, interactions, and nonlinearities. They are invariant to attribute scales (units) and insensitive to outliers, and thus require little data preprocessing such as normalization. The regularized random forest (RRF) is one type of regularized trees. Guided RRF is an enhanced RRF in which the penalty is guided by the importance scores from an ordinary random forest.
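The guided variant can be sketched by turning the single penalty coefficient into a per-feature one. This follows a commonly cited form of guided RRF, lam_i = (1 - gamma) * lam0 + gamma * imp_i with importances normalized to [0, 1]; the function name, parameter names, and the mixing scheme here are assumptions for illustration, and the published method may differ in detail.

```python
def guided_coefficients(importance, lam0=0.7, gamma=0.5):
    # Per-feature penalty coefficients for a guided RRF sketch.
    # importance: raw importance scores from an ordinary random forest.
    # gamma in [0, 1] controls how strongly importances steer the penalty:
    # gamma = 0 recovers a plain RRF with a uniform coefficient lam0.
    top = max(importance.values())
    return {
        f: (1 - gamma) * lam0 + gamma * (score / top)
        for f, score in importance.items()
    }

# Toy importances: "a" looks twice as important as "b" to the base forest,
# so "a" receives a weaker penalty (coefficient closer to 1).
coefs = guided_coefficients({"a": 1.0, "b": 0.5}, lam0=0.8, gamma=0.5)
```

Features that the base forest already rates highly get coefficients near 1 (little penalty), so the regularized forest is nudged toward them, while unimportant features are penalized more heavily.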

