Mallows' Cp - Definition and Properties

Definition and Properties

Mallows' Cp addresses the issue of overfitting, in which model selection statistics such as the residual sum of squares never increase as more variables are added to a model. Thus, if we aim to select the model giving the smallest residual sum of squares, the model including all variables would always be selected. The Cp statistic calculated on a sample of data instead estimates the mean squared prediction error (MSPE), its population target:


E\sum_j (\hat{Y}_j - E(Y_j|X_j))^2/\sigma^2,

where \hat{Y}_j is the fitted value from the regression model for the jth case, E(Y_j | X_j) is the expected value for the jth case, and \sigma^2 is the error variance (assumed constant across the cases). The MSPE will not automatically get smaller as more variables are added. The optimum model under this criterion is a compromise influenced by the sample size, the effect sizes of the different predictors, and the degree of collinearity between them.
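The claim that the residual sum of squares cannot increase as regressors are added can be checked numerically. The following is a minimal sketch on simulated data; the helper name `rss`, the seed, and the data-generating model are illustrative assumptions, not part of the original text.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Simulated data (assumption): only the first two of six candidate
# predictors actually influence the response.
rng = np.random.default_rng(0)
n, k = 50, 6
Z = rng.normal(size=(n, k))
y = 1.0 + 2.0 * Z[:, 0] - 1.0 * Z[:, 1] + rng.normal(size=n)

# Fit a nested sequence of models: intercept only, then add one predictor at a time.
rss_path = []
for p in range(k + 1):
    X = np.column_stack([np.ones(n), Z[:, :p]])
    rss_path.append(rss(X, y))

for p, value in enumerate(rss_path):
    print(f"{p} predictors: RSS = {value:.3f}")
```

The printed values form a non-increasing sequence even though four of the predictors are pure noise, which is why the residual sum of squares alone cannot be used to choose among models of different size.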

If P regressors are selected from a set of K > P, the Cp statistic for that particular set of regressors is defined as

C_p = \frac{SSE_p}{S^2} - N + 2P,

where

  • SSE_p = \sum_{i=1}^{N} (Y_i - Y_{pi})^2 is the error sum of squares for the model with P regressors,
  • Y_{pi} is the predicted value of the ith observation of Y from the P regressors,
  • S^2 is the residual mean square after regression on the complete set of K regressors and can be estimated by the mean square error (MSE),
  • and N is the sample size.
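As an illustration, the statistic in its usual form, C_p = SSE_p / S^2 - N + 2P, can be computed directly with NumPy. This is a sketch under stated assumptions rather than a reference implementation: the function names `sse` and `mallows_cp` are hypothetical, the constant column is counted as one of the regressors (so it is included in both P and K), S^2 is taken as the full-model error sum of squares divided by N - K, and the data are simulated.

```python
import numpy as np

def sse(X, y):
    """Error sum of squares of a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def mallows_cp(X_full, y, subset):
    """C_p = SSE_p / S^2 - N + 2P for the columns of X_full listed in `subset`.

    Assumptions: X_full holds all K candidate regressors, including the
    constant column, and S^2 is the full-model residual mean square
    SSE_K / (N - K).
    """
    n, k = X_full.shape
    p = len(subset)
    s2 = sse(X_full, y) / (n - k)
    return sse(X_full[:, subset], y) / s2 - n + 2 * p

# Simulated data (assumption): the response depends on columns 1 and 3 only.
rng = np.random.default_rng(1)
n = 100
Z = rng.normal(size=(n, 4))
X_full = np.column_stack([np.ones(n), Z])  # column 0 is the constant
y = 3.0 + 1.5 * Z[:, 0] - 2.0 * Z[:, 2] + rng.normal(size=n)

cp_full = mallows_cp(X_full, y, [0, 1, 2, 3, 4])  # all K = 5 regressors
cp_true = mallows_cp(X_full, y, [0, 1, 3])        # the generating subset
cp_small = mallows_cp(X_full, y, [0])             # constant only

print(cp_full, cp_true, cp_small)
```

Under these conventions the full model always yields C_p = K exactly, because SSE_K / S^2 = N - K. A subset that omits an important regressor has an inflated SSE_p and therefore a much larger C_p, as the constant-only model shows here.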
