Stepwise Regression - Selection Criterion

One of the main issues with stepwise regression is that it searches a large space of possible models and is therefore prone to overfitting the data: it will often fit much better in sample than it does on new out-of-sample data. This problem can be mitigated if the criterion for adding (or deleting) a variable is stiff enough. The key line in the sand is at what can be thought of as the Bonferroni point: namely, how significant the best spurious variable should be by chance alone. On a t-statistic scale, this occurs at about √(2 log p), where p is the number of predictors. Unfortunately, this means that many variables which actually carry signal will not be included. This fence turns out to be the right trade-off between over-fitting and missing signal: if we compare the risk of different cutoffs, this bound stays within a factor of 2 log p of the best possible risk, while any other cutoff incurs a larger such risk inflation.
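The Bonferroni point can be checked numerically. The sketch below (a simulation under assumed Gaussian noise; all variable names are illustrative, not from any particular library) regresses a pure-noise response on p pure-noise predictors one at a time and compares the largest |t|-statistic against √(2 log p):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 2000                     # observations, candidate predictors

# Pure noise: neither X nor y carries any signal.
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Marginal correlation of each predictor with y, then the usual
# simple-regression t-statistic t = r * sqrt(n - 2) / sqrt(1 - r^2).
Xc = X - X.mean(axis=0)
yc = y - y.mean()
r = Xc.T @ yc / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

# The best spurious variable: its |t| should land near the Bonferroni point.
t_max = np.abs(t).max()
bonferroni_point = np.sqrt(2 * np.log(p))
print(f"max |t| over {p} noise predictors: {t_max:.2f}")
print(f"Bonferroni point sqrt(2 log p):    {bonferroni_point:.2f}")
```

Because the maximum of p roughly standard-normal t-statistics concentrates near √(2 log p), any cutoff looser than this point will admit spurious variables purely by chance.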
