Degrees of Freedom (statistics) - Effective Degrees of Freedom

Effective Degrees of Freedom

Many regression methods, including ridge regression, linear smoothers and smoothing splines, are not based on ordinary least-squares projections but on regularized (generalized and/or penalized) least squares, so degrees of freedom defined in terms of dimensionality are generally not useful for these procedures. However, these procedures are still linear in the observations, and the fitted values of the regression can be expressed in the form

$$\hat{y} = Hy,$$

where $\hat{y}$ is the vector of fitted values at each of the original covariate values from the fitted model, y is the original vector of responses, and H is the hat matrix or, more generally, smoother matrix.
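As a minimal sketch of such a linear fit, the snippet below builds the smoother matrix for ridge regression, assuming the standard form H = X(X'X + λI)^{-1}X' with no intercept. The data, the penalty value and the function name `ridge_hat_matrix` are hypothetical and chosen only for illustration.

```python
import numpy as np

def ridge_hat_matrix(X, lam):
    """Smoother matrix H = X (X'X + lam*I)^{-1} X' for ridge regression."""
    p = X.shape[1]
    # Solve the linear system rather than forming an explicit inverse.
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # hypothetical design matrix
y = rng.normal(size=20)        # hypothetical responses

H = ridge_hat_matrix(X, lam=2.0)
y_hat = H @ y                  # fitted values are a linear map of y

# The same fitted values obtained from the ridge coefficients directly.
beta = np.linalg.solve(X.T @ X + 2.0 * np.eye(3), X.T @ y)
```

Here `H @ y` and `X @ beta` agree, but H is not idempotent, so it is not an orthogonal projection.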

For statistical inference, sums-of-squares can still be formed: the model sum-of-squares is $\|Hy\|^2$; the residual sum-of-squares is $\|y - Hy\|^2$. However, because H does not correspond to an ordinary least-squares fit (i.e. is not an orthogonal projection), these sums-of-squares no longer have (scaled, non-central) chi-squared distributions, and dimensionally defined degrees of freedom are not useful. Instead, each follows a generalized chi-squared distribution, and the theory associated with this distribution provides an alternative route to the answers provided by effective degrees of freedom.

The effective degrees of freedom of the fit can be defined in various ways to implement goodness-of-fit tests, cross-validation and other inferential procedures. Here one can distinguish between regression effective degrees of freedom and residual effective degrees of freedom.

Regression effective degrees of freedom.

Regarding the former, appropriate definitions include the trace of the hat matrix, tr(H); the trace of the quadratic form of the hat matrix, tr(H'H); the form tr(2H − HH'); and the Satterthwaite approximation, tr(H'H)^2/tr(H'HH'H). In the case of linear regression, the hat matrix H is X(X'X)^{-1}X', and all these definitions reduce to the usual degrees of freedom. Notice that

$$\operatorname{tr}(H) = \sum_i h_{ii} = \sum_i \frac{\partial \hat{y}_i}{\partial y_i},$$

i.e., the regression (not residual) degrees of freedom in linear models are "the sum of the sensitivities of the fitted values with respect to the observed response values".
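The four definitions can be compared numerically. The sketch below evaluates them for an ordinary least-squares hat matrix and for a ridge smoother matrix, and checks the sensitivity interpretation of tr(H) by perturbing one response at a time. The ridge form, the simulated data and the helper name `edf_definitions` are assumptions made for illustration.

```python
import numpy as np

def edf_definitions(H):
    """Evaluate four candidate regression effective degrees of freedom."""
    HtH = H.T @ H
    return {
        "tr(H)": np.trace(H),
        "tr(H'H)": np.trace(HtH),
        "tr(2H - HH')": np.trace(2 * H - H @ H.T),
        "satterthwaite": np.trace(HtH) ** 2 / np.trace(HtH @ HtH),
    }

rng = np.random.default_rng(1)
n, p = 30, 4
X = rng.normal(size=(n, p))    # hypothetical design matrix

H_ols = X @ np.linalg.solve(X.T @ X, X.T)
H_ridge = X @ np.linalg.solve(X.T @ X + 5.0 * np.eye(p), X.T)

edf_ols = edf_definitions(H_ols)      # every entry equals p
edf_ridge = edf_definitions(H_ridge)  # entries differ and lie below p

# Sensitivity of each fitted value to its own response, by finite differences.
y = rng.normal(size=n)
eps = 1e-6
base = H_ridge @ y
sensitivities = []
for i in range(n):
    y_pert = y.copy()
    y_pert[i] += eps
    sensitivities.append(((H_ridge @ y_pert)[i] - base[i]) / eps)
```

For the ridge smoother the definitions are ordered tr(H'H) < tr(H) < tr(2H − HH'), and the summed sensitivities reproduce tr(H).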

Residual effective degrees of freedom.

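There are corresponding definitions of residual effective degrees of freedom (redf), with H replaced by I − H. For example, if the goal is to estimate the error variance, the redf would be defined as tr((I − H)'(I − H)), and the unbiased estimate is (with $\hat{r} = y - Hy$)

$$\hat{\sigma}^2 = \frac{\|\hat{r}\|^2}{\operatorname{tr}\left((I - H)'(I - H)\right)},$$

or:

$$\hat{\sigma}^2 = \frac{\|\hat{r}\|^2}{n - \operatorname{tr}(2H - HH')} = \frac{\|\hat{r}\|^2}{n - 2\operatorname{tr}(H) + \operatorname{tr}(HH')} \approx \frac{\|\hat{r}\|^2}{n - 1.25\operatorname{tr}(H) + 0.5}.$$

The last approximation above reduces the computational cost from O(n^2) to only O(n). In general the numerator would be the objective function being minimized; e.g., if the hat matrix includes an observation covariance matrix, Σ, then $\|\hat{r}\|^2$ becomes $\hat{r}' \Sigma^{-1} \hat{r}$.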
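A rough sketch of these variance estimates follows. It computes the residual sum of squares once and divides it by three versions of the redf: the trace form tr((I − H)'(I − H)), its expansion n − 2 tr(H) + tr(HH'), and the cheaper approximation n − 1.25 tr(H) + 0.5 (assumed here to be the approximation the text refers to). The ridge smoother, the simulated data and the function name `error_variance_estimates` are illustrative assumptions.

```python
import numpy as np

def error_variance_estimates(H, y):
    """Return error-variance estimates using three redf denominators."""
    n = len(y)
    r = y - H @ y                      # residual vector
    rss = r @ r                        # residual sum of squares
    I = np.eye(n)
    redf_exact = np.trace((I - H).T @ (I - H))
    redf_expanded = n - 2 * np.trace(H) + np.sum(H ** 2)  # sum(H**2) = tr(HH')
    redf_approx = n - 1.25 * np.trace(H) + 0.5            # needs only diag(H)
    return rss / redf_exact, rss / redf_expanded, rss / redf_approx

rng = np.random.default_rng(2)
n, p = 50, 5
X = rng.normal(size=(n, p))                     # hypothetical design matrix
y = X @ np.ones(p) + rng.normal(size=n)         # hypothetical responses

H = X @ np.linalg.solve(X.T @ X + 3.0 * np.eye(p), X.T)  # ridge smoother
s2_exact, s2_expanded, s2_approx = error_variance_estimates(H, y)
```

The first two estimates coincide by construction; the third differs slightly because it replaces tr(HH') with a function of tr(H) alone.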

General.

Note that unlike in the original case, non-integer degrees of freedom are allowed, though the value must usually still be constrained between 0 and n.

Consider, as an example, the k-nearest neighbour smoother, which is the average of the k nearest measured values to the given point. Then, at each of the n measured points, the weight of the original value on the linear combination that makes up the predicted value is just 1/k. Thus, the trace of the hat matrix is n/k, so the smooth costs n/k effective degrees of freedom.
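The n/k result can be checked with a small sketch that builds the smoother matrix explicitly. It assumes a one-dimensional covariate with distinct values, so that each point counts as one of its own k nearest neighbours; the grid and the function name `knn_smoother_matrix` are hypothetical.

```python
import numpy as np

def knn_smoother_matrix(x, k):
    """Smoother matrix of a k-nearest-neighbour average on 1-D points x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        # Indices of the k points closest to x[i]; x[i] itself has distance 0.
        nearest = np.argsort(np.abs(x - x[i]), kind="stable")[:k]
        H[i, nearest] = 1.0 / k
    return H

x = np.linspace(0.0, 1.0, 12)   # hypothetical, distinct covariate values
H = knn_smoother_matrix(x, k=3)
edf = np.trace(H)               # each diagonal entry is 1/k
```

Every row of H sums to one, and the diagonal holds 1/k in each position, so the trace is 12/3 = 4.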

As another example, consider the existence of nearly duplicated observations. Naive application of the classical formula, n − p, would lead to over-estimation of the residual degrees of freedom, as if each observation were independent. More realistically, though, the hat matrix H = X(X'Σ^{-1}X)^{-1}X'Σ^{-1} would involve an observation covariance matrix Σ indicating the non-zero correlation among observations. The more general formulation of effective degrees of freedom would result in a more realistic estimate for, e.g., the error variance σ^2.

Similar concepts are the equivalent degrees of freedom in non-parametric regression, the degree of freedom of signal in atmospheric studies, and the non-integer degree of freedom in geodesy.
