Simple Linear Regression - Numerical Example

This example uses the data set from the Ordinary least squares article, which gives the average mass of American women aged 30–39 as a function of their height. Although the OLS article argues that a quadratic regression would be more appropriate for this data, the simple linear regression model is applied here instead.

Height (m)  x_i   1.47  1.50  1.52  1.55  1.57  1.60  1.63  1.65  1.68  1.70  1.73  1.75  1.78  1.80  1.83
Mass (kg)   y_i  52.21 53.12 54.48 55.84 57.20 58.57 59.93 61.29 63.11 64.47 66.28 68.10 69.92 72.19 74.46

There are n = 15 points in this data set. A hand calculation would start by computing the following five sums:

\begin{align} & S_x = \sum x_i = 24.76,\quad S_y = \sum y_i = 931.17 \\ & S_{xx} = \sum x_i^2 = 41.0532, \quad S_{xy} = \sum x_iy_i = 1548.2453, \quad S_{yy} = \sum y_i^2 = 58498.5439 \end{align}
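
The five sums can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names (`S_x`, `S_xy`, and so on) are chosen here to mirror the symbols above.

```python
# Heights (m) and masses (kg) copied from the table above.
x = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
     1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83]
y = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
     63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46]

n = len(x)

# The five sums needed for the hand calculation.
S_x = sum(x)
S_y = sum(y)
S_xx = sum(xi * xi for xi in x)
S_xy = sum(xi * yi for xi, yi in zip(x, y))
S_yy = sum(yi * yi for yi in y)

print(f"n = {n}")
print(f"S_x = {S_x:.2f}, S_y = {S_y:.2f}")
print(f"S_xx = {S_xx:.4f}, S_xy = {S_xy:.4f}, S_yy = {S_yy:.4f}")
```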

These quantities are then used to calculate the estimates of the regression coefficients and their standard errors:

\begin{align} & \hat\beta = \frac{nS_{xy}-S_xS_y}{nS_{xx}-S_x^2} = 61.272 \\ & \hat\alpha = \tfrac{1}{n}S_y - \hat\beta \tfrac{1}{n}S_x = -39.062 \\ & s_\varepsilon^2 = \tfrac{1}{n(n-2)} \big( nS_{yy}-S_y^2 - \hat\beta^2(nS_{xx}-S_x^2) \big) = 0.5762 \\ & s_\beta^2 = \frac{n s_\varepsilon^2}{nS_{xx} - S_x^2} = 3.1539 \\ & s_\alpha^2 = s_\beta^2 \tfrac{1}{n} S_{xx} = 8.63185 \end{align}
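
As a rough check, the same formulas can be evaluated directly from the five sums quoted earlier. The sketch below assumes those sums as inputs; the name `den` for the shared denominator is introduced here purely for illustration.

```python
# Sums quoted in the text (n = 15 data points).
n = 15
S_x, S_y = 24.76, 931.17
S_xx, S_xy, S_yy = 41.0532, 1548.2453, 58498.5439

# Shared denominator n*S_xx - S_x^2 (name chosen for this sketch).
den = n * S_xx - S_x**2

# Slope and intercept estimates.
beta_hat = (n * S_xy - S_x * S_y) / den
alpha_hat = S_y / n - beta_hat * S_x / n

# Residual variance, then the variances of the two coefficient estimates.
s_eps2 = (n * S_yy - S_y**2 - beta_hat**2 * den) / (n * (n - 2))
s_beta2 = n * s_eps2 / den
s_alpha2 = s_beta2 * S_xx / n

print(f"beta_hat = {beta_hat:.3f}, alpha_hat = {alpha_hat:.3f}")
print(f"s_eps^2 = {s_eps2:.4f}, s_beta^2 = {s_beta2:.4f}, s_alpha^2 = {s_alpha2:.5f}")
```

Because `den` is a small difference of two much larger numbers, rounding the sums to fewer digits before this step would noticeably change the results.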

The 0.975 quantile of Student's t-distribution with n − 2 = 13 degrees of freedom is t^*_{13} = 2.1604, and thus the 95% confidence intervals for α and β are

\begin{align} & \alpha \in [\hat\alpha \mp t^*_{13} s_\alpha] = [-45.4,\ -32.7] \\ & \beta \in [\hat\beta \mp t^*_{13} s_\beta] = [57.4,\ 65.1] \end{align}
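
Each interval is the estimate plus or minus the t quantile times the standard error (the square root of the corresponding variance). The sketch below assumes the rounded estimates and variances quoted above and takes the quantile 2.1604 from the text rather than computing it; `interval` is a hypothetical helper, not part of any library.

```python
import math

# Quantile and estimates as quoted in the text (rounded values).
t_star = 2.1604
alpha_hat, s_alpha2 = -39.062, 8.63185
beta_hat, s_beta2 = 61.272, 3.1539


def interval(estimate, variance, t):
    """Return (lower, upper) = estimate -/+ t * sqrt(variance)."""
    half_width = t * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


alpha_lo, alpha_hi = interval(alpha_hat, s_alpha2, t_star)
beta_lo, beta_hi = interval(beta_hat, s_beta2, t_star)

print(f"alpha in [{alpha_lo:.1f}, {alpha_hi:.1f}]")
print(f"beta  in [{beta_lo:.1f}, {beta_hi:.1f}]")
```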

The product-moment correlation coefficient might also be calculated:

 \hat{r} = \frac{nS_{xy} - S_xS_y}{\sqrt{(nS_{xx}-S_x^2)(nS_{yy}-S_y^2)}} \approx 0.9946
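
The correlation coefficient can be evaluated from the same five sums. This is a short sketch under the assumption that the sums quoted earlier are used as given.

```python
import math

# Sums quoted in the text (n = 15 data points).
n = 15
S_x, S_y = 24.76, 931.17
S_xx, S_xy, S_yy = 41.0532, 1548.2453, 58498.5439

# Numerator and the two factors under the square root.
numerator = n * S_xy - S_x * S_y
x_part = n * S_xx - S_x**2
y_part = n * S_yy - S_y**2

r_hat = numerator / math.sqrt(x_part * y_part)
print(f"r_hat = {r_hat:.4f}")
```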

This example also demonstrates that sophisticated calculations will not overcome the use of badly prepared data. The heights were originally given in inches and have been converted to the nearest centimetre. Because one inch is exactly 2.54 cm, rounding to whole centimetres introduces a rounding error, so the conversion is not exact. The original inches can be recovered by Round(x/0.0254) and then re-converted to metres without rounding; if this is done, the results become

\begin{align} & \hat\beta = 61.6746 \\ & \hat\alpha = -39.7468 \\ \end{align}
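
The recovery step can be sketched as follows: divide each tabulated height by 0.0254, round to the nearest whole inch, convert back to metres without rounding, and refit. The helper `fit_line` is a hypothetical name introduced for this sketch; it uses the mean-centred form of the least-squares formulas, which is algebraically equivalent to the sum form used earlier.

```python
# Heights (m, rounded to the nearest cm) and masses (kg) from the table.
x = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
     1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83]
y = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
     63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46]


def fit_line(xs, ys):
    """Least-squares slope and intercept (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(xs, ys))
    sxx = sum((a - x_mean) ** 2 for a in xs)
    beta = sxy / sxx
    return beta, y_mean - beta * x_mean


# Recover whole inches, then convert back to metres without rounding.
inches = [round(h / 0.0254) for h in x]
x_exact = [k * 0.0254 for k in inches]

beta_hat, alpha_hat = fit_line(x_exact, y)
print(inches)
print(f"beta_hat = {beta_hat:.4f}, alpha_hat = {alpha_hat:.4f}")
```

The recovered heights turn out to be consecutive whole inches, which is consistent with the data having been tabulated in inches originally.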

Thus a seemingly small variation in the data has a real effect: the slope estimate changes by about 0.4 kg/m (roughly 0.7%) and the intercept by about 0.7 kg.
