Fitting The Regression Line
Suppose there are n data points {yi, xi}, where i = 1, 2, …, n. The goal is to find the equation of the straight line
which would provide a "best" fit for the data points. Here the "best" will be understood as in the least-squares approach: such a line that minimizes the sum of squared residuals of the linear regression model. In other words, numbers α and β solve the following minimization problem:
By using either calculus, the geometry of inner product spaces or simply expanding to get a quadratic in α and β, it can be shown that the values of α and β that minimize the objective function Q are
where rxy is the sample correlation coefficient between x and y, sx is the standard deviation of x, and sy is correspondingly the standard deviation of y. Horizontal bar over a variable means the sample average of that variable. For example:
Substituting the above expressions for and into
yields
This shows the role plays in the regression line of standardized data points.
Read more about this topic: Simple Linear Regression
Famous quotes containing the words fitting and/or line:
“Greater glory in the sun,
An evening chill upon the air,
Bid imagination run
Much on the Great Questioner;
What He can question, what if questioned I
Can with a fitting confidence reply.”
—William Butler Yeats (18651939)
“Today, the notion of progress in a single line without goal or limit seems perhaps the most parochial notion of a very parochial century.”
—Lewis Mumford (18951990)
