Stepwise Regression - Selection Criterion

Selection Criterion

One of the main issues with stepwise regression is that it searches a large space of possible models, which makes it prone to overfitting the data. In other words, a model chosen by stepwise regression will often fit much better in sample than it does on new out-of-sample data. This problem can be mitigated if the criterion for adding (or deleting) a variable is stiff enough. The key line in the sand is what can be thought of as the Bonferroni point: namely, how significant the best spurious variable should be based on chance alone. On a t-statistic scale, this occurs at about $\sqrt{2 \log p}$, where $p$ is the number of predictors. Unfortunately, a cutoff this strict means that many variables which actually carry signal will not be included. Even so, this fence turns out to be the right trade-off between overfitting and missing signal. If we look at the risk of different cutoffs, then using this bound will be within a $2 \log p$ factor of the best possible risk. Any other cutoff will end up with a larger risk inflation.
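
As a rough illustration of the cutoff described above, the following sketch computes the approximate Bonferroni point on the t-statistic scale, assuming the usual approximation sqrt(2 log p) with a natural logarithm, and applies it as an entry rule. The function names (`bonferroni_t_cutoff`, `exact_bonferroni_z`, `passes_cutoff`) and the 0.05 level used for comparison are hypothetical choices made for this example, not part of any particular stepwise implementation.

```python
import math
from statistics import NormalDist


def bonferroni_t_cutoff(p):
    """Approximate Bonferroni point sqrt(2 * ln p) for p candidate predictors."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return math.sqrt(2.0 * math.log(p))


def exact_bonferroni_z(p, alpha=0.05):
    """Two-sided normal critical value at level alpha / p.

    Included only for comparison; alpha is an assumed level,
    not something fixed by the sqrt(2 log p) rule itself.
    """
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * p))


def passes_cutoff(t_stats):
    """Return the indices of candidates whose |t| exceeds the cutoff.

    t_stats holds one t-statistic per candidate predictor, so the
    number of predictors p is taken to be len(t_stats).
    """
    cutoff = bonferroni_t_cutoff(len(t_stats))
    return [i for i, t in enumerate(t_stats) if abs(t) > cutoff]


if __name__ == "__main__":
    for p in (10, 100, 1000):
        print(p, round(bonferroni_t_cutoff(p), 3), round(exact_bonferroni_z(p), 3))
```

With 100 candidate predictors the cutoff is a little above 3, so a variable with a t-statistic of 2, which would look significant if tested alone, would not be added.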
