Lack-of-fit Sum of Squares - Mathematical Details

Mathematical Details

Consider fitting a line with one predictor variable. Define i as an index of each of the n distinct x values, j as an index of the response variable observations for a given x value, and ni as the number of y values associated with the i th x value. The value of each response variable observation can be represented by

Let

be the least squares estimates of the unobservable parameters α and β based on the observed values of x i and Y i j.

Let

be the fitted values of the response variable. Then

are the residuals, which are observable estimates of the unobservable values of the error term ε ij. Because of the nature of the method of least squares, the whole vector of residuals, with

scalar components, necessarily satisfies the two constraints

It is thus constrained to lie in an (N − 2)-dimensional subspace of R N, i.e. there are N − 2 "degrees of freedom for error".

Now let

be the average of all Y-values associated with the i th x-value.

We partition the sum of squares due to error into two components:


\begin{align}
& \sum_{i=1}^n \sum_{j=1}^{n_i} \widehat\varepsilon_{ij}^{\,2}
= \sum_{i=1}^n \sum_{j=1}^{n_i} \left( Y_{ij} - \widehat Y_i \right)^2 \\
& = \underbrace{ \sum_{i=1}^n \sum_{j=1}^{n_i} \left(Y_{ij} - \overline Y_{i\bullet}\right)^2 }_\text{(sum of squares due to pure error)}
+ \underbrace{ \sum_{i=1}^n n_i \left( \overline Y_{i\bullet} - \widehat Y_i \right)^2. }_\text{(sum of squares due to lack of fit)}
\end{align}

Read more about this topic:  Lack-of-fit Sum Of Squares

Famous quotes containing the words mathematical and/or details:

    The circumstances of human society are too complicated to be submitted to the rigour of mathematical calculation.
    Marquis De Custine (1790–1857)

    Then he told the news media
    the strange details of his death
    and they hammered him up in the marketplace
    and sold him and sold him and sold him.
    My death the same.
    Anne Sexton (1928–1974)