Pearson Product-moment Correlation Coefficient - Pearson's Correlation and Least Squares Regression Analysis

The square of the sample correlation coefficient, also known as the coefficient of determination, estimates the fraction of the variance in Y that is explained by X in a simple linear regression. As a starting point, the total variation in the Y_i around their average value \bar{Y} can be decomposed as follows:


\sum_i (Y_i - \bar{Y})^2 = \sum_i (Y_i-\hat{Y}_i)^2 + \sum_i (\hat{Y}_i-\bar{Y})^2,

where the \hat{Y}_i are the fitted values from the regression analysis. Dividing both sides by the total sum of squares gives


1 = \frac{\sum_i (Y_i-\hat{Y}_i)^2}{\sum_i (Y_i - \bar{Y})^2} + \frac{\sum_i (\hat{Y}_i-\bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}.

The two summands above are, respectively, the fraction of the variance in Y that is unexplained by X (left) and the fraction that is explained by X (right).
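The decomposition can be checked numerically. The following is a minimal sketch, assuming a simple linear regression with an intercept; the data values and the helper name `fit_line` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data, used only to illustrate the decomposition.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line with an intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]  # the fitted values \hat{Y}_i
y_bar = sum(ys) / len(ys)

ss_total = sum((y - y_bar) ** 2 for y in ys)              # left-hand side
ss_resid = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained part
ss_expl = sum((f - y_bar) ** 2 for f in fitted)           # explained part

unexplained = ss_resid / ss_total
explained = ss_expl / ss_total
print(unexplained + explained)  # equals 1 up to floating-point rounding
```

The identity relies on the fit including an intercept; for a line forced through the origin the two sums of squares need not add up to the total.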

Next, we apply a property of least squares regression models: the sample covariance between the fitted values \hat{Y}_i and the residuals Y_i - \hat{Y}_i is zero. Thus, the sample correlation coefficient between the observed and fitted response values in the regression can be written


\begin{align}
r(Y,\hat{Y}) &= \frac{\sum_i(Y_i-\bar{Y})(\hat{Y}_i-\bar{Y})}{\sqrt{\sum_i(Y_i-\bar{Y})^2\cdot \sum_i(\hat{Y}_i-\bar{Y})^2}}\\
&= \frac{\sum_i(Y_i-\hat{Y}_i+\hat{Y}_i-\bar{Y})(\hat{Y}_i-\bar{Y})}{\sqrt{\sum_i(Y_i-\bar{Y})^2\cdot \sum_i(\hat{Y}_i-\bar{Y})^2}}\\
&= \frac{ \sum_i [(Y_i-\hat{Y}_i)(\hat{Y}_i-\bar{Y}) + (\hat{Y}_i-\bar{Y})^2] }{\sqrt{\sum_i(Y_i-\bar{Y})^2\cdot \sum_i(\hat{Y}_i-\bar{Y})^2}}\\
&= \frac{ \sum_i (\hat{Y}_i-\bar{Y})^2 }{\sqrt{\sum_i(Y_i-\bar{Y})^2\cdot \sum_i(\hat{Y}_i-\bar{Y})^2}}\\
&= \sqrt{\frac{\sum_i(\hat{Y}_i-\bar{Y})^2}{\sum_i(Y_i-\bar{Y})^2}}.
\end{align}

Thus


r(Y,\hat{Y})^2 = \frac{\sum_i(\hat{Y}_i-\bar{Y})^2}{\sum_i(Y_i-\bar{Y})^2}

is the proportion of variance in Y explained by a linear function of X.
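As a numerical illustration of this result, the sketch below fits a least squares line to hypothetical toy data and compares r(Y,\hat{Y})^2 with the explained fraction of variance. It assumes a simple linear regression with an intercept; the data and the helper name `pearson_r` are illustrative, not part of the derivation.

```python
import math


def pearson_r(a, b):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    num = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    den = math.sqrt(
        sum((u - a_bar) ** 2 for u in a) * sum((v - b_bar) ** 2 for v in b)
    )
    return num / den


# Hypothetical toy data, used only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Least squares fit with an intercept.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in xs]

# Cross term from the derivation: residuals against centered fitted values.
cross = sum((y - f) * (f - y_bar) for y, f in zip(ys, fitted))

explained = sum((f - y_bar) ** 2 for f in fitted) / sum(
    (y - y_bar) ** 2 for y in ys
)
r_yf = pearson_r(ys, fitted)  # r(Y, \hat{Y})
r_xy = pearson_r(xs, ys)      # r(X, Y)

print(cross)                   # zero up to floating-point rounding
print(r_yf ** 2, explained)    # the two values agree up to rounding
print(r_xy ** 2)               # same value: \hat{Y} is a linear function of X
```

Because the fitted values are a linear function of X, the squared correlation between X and Y takes the same value as r(Y,\hat{Y})^2 in the simple linear case.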
