Regression Dilution

Regression dilution, also known as "attenuation", is the bias of an estimated regression slope towards zero caused by errors in the predictor variable.

Consider fitting a straight line for the relationship of an outcome variable y to a predictor variable x, and estimating the gradient (slope) of the line. Statistical variability, measurement error or random noise in the y variable cause imprecision in the estimated gradient, but not bias: on average, the procedure calculates the right gradient. However, variability, measurement error or random noise in the x variable cause bias in the estimated gradient (as well as imprecision). The greater the variance of the error in the x measurement, the further the estimated slope is pulled towards 0 and away from the true gradient. This "dilution" of the gradient towards 0 is referred to as "regression dilution," "attenuation," or "attenuation bias."
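The contrast between noise in y and noise in x can be illustrated with a small simulation. The sketch below is illustrative only: the function name `simulate_slopes`, the true gradient of 2.0, the sample size and the noise levels are all assumptions chosen for the example, and the noise is assumed to be additive, normally distributed and independent of the true x (the classical measurement-error setting). Under those assumptions, the slope fitted to a noisy x is expected to shrink by roughly the factor var(x) / (var(x) + var(noise)), which is 0.5 for the values used here.

```python
import numpy as np


def simulate_slopes(true_slope=2.0, n=200_000, sd_x=1.0, sd_noise=1.0, seed=0):
    """Fit y on x twice: once with noise only in y, once with noise only in x.

    All parameter values are illustrative assumptions, not taken from the text.
    Returns (slope with noisy y, slope with noisy x).
    """
    rng = np.random.default_rng(seed)

    # Error-free predictor and the exact linear outcome it implies.
    x_true = rng.normal(0.0, sd_x, n)
    y_clean = true_slope * x_true

    # Case 1: noise added to the outcome only.
    y_noisy = y_clean + rng.normal(0.0, sd_noise, n)
    slope_y_noise = np.polyfit(x_true, y_noisy, 1)[0]

    # Case 2: noise added to the predictor only.
    x_noisy = x_true + rng.normal(0.0, sd_noise, n)
    slope_x_noise = np.polyfit(x_noisy, y_clean, 1)[0]

    return slope_y_noise, slope_x_noise


if __name__ == "__main__":
    s_y, s_x = simulate_slopes()
    print(f"noise in y only: estimated slope = {s_y:.3f}")
    print(f"noise in x only: estimated slope = {s_x:.3f}")
```

With these settings the first estimate should land close to the true gradient of 2.0, while the second should land close to 1.0, about half the true value. Increasing `sd_noise` pushes the second estimate further towards 0 and leaves the first one centred on 2.0, only less precise.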

It may seem counter-intuitive that noise in the predictor variable x induces a bias, but noise in the outcome variable y does not. Recall that linear regression is not symmetric: the line of best fit for predicting y from x (the usual linear regression) is not the same as the line of best fit for predicting x from y (see, for example, Draper & Smith, "Applied Regression Analysis"; page 5 of the 1966 edition).
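The asymmetry of linear regression can be checked numerically. The following sketch is an illustration under assumed values (a true gradient of 2.0, unit-variance x and unit-variance noise in y); the function name `both_regression_gradients` is hypothetical. It fits y on x and x on y to the same data, then expresses both lines as gradients in the same y-against-x plane so that they can be compared directly.

```python
import numpy as np


def both_regression_gradients(true_slope=2.0, n=100_000, seed=1):
    """Compare the y-on-x line with the x-on-y line for the same data.

    Parameter values are illustrative assumptions. Both gradients are
    returned as dy/dx so they refer to the same axes.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)
    y = true_slope * x + rng.normal(0.0, 1.0, n)

    # Usual regression: predict y from x.
    gradient_y_on_x = np.polyfit(x, y, 1)[0]

    # Reverse regression: predict x from y, then invert the slope
    # so that it is expressed as dy/dx rather than dx/dy.
    gradient_x_on_y = 1.0 / np.polyfit(y, x, 1)[0]

    return gradient_y_on_x, gradient_x_on_y


if __name__ == "__main__":
    g_yx, g_xy = both_regression_gradients()
    print(f"y on x: dy/dx = {g_yx:.3f}")
    print(f"x on y: dy/dx = {g_xy:.3f}")
```

Under these assumptions the two gradients differ (roughly 2.0 against roughly 2.5), even though both lines are fitted to the same points. Each fit treats its own predictor as error-free and assigns all the scatter to its outcome, which is why noise in x and noise in y do not have interchangeable effects.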
