Pearson Product-moment Correlation Coefficient - Removing Correlation

It is always possible to remove the correlation between random variables with a linear transformation, even if the relationship between the variables is nonlinear. A presentation of this result for population distributions is given by Cox & Hinkley.

A corresponding result exists for sample correlations, in which the sample correlation is reduced to zero. Suppose a vector of n random variables is sampled m times. Let X be an m by n matrix where X_{i,j} is the jth variable of sample i, and let Z be an m by m square matrix with every element equal to 1. Then D is the data transformed so that every random variable has zero mean, and T is the data transformed so that all variables have zero mean and zero correlation with all other variables; the moment matrix of T will be the identity matrix, since T^{\mathrm{T}} T = (D^{\mathrm{T}} D)^{-1/2} (D^{\mathrm{T}} D) (D^{\mathrm{T}} D)^{-1/2} = I. T has to be further divided by the standard deviation to get unit variance. The transformed variables will be uncorrelated, even though they may not be independent.

D = X - \frac{1}{m} Z X

T = D (D^{\mathrm{T}} D)^{-1/2}
where an exponent of -1/2 represents the matrix square root of the inverse of a matrix, and the superscript \mathrm{T} denotes the matrix transpose. If a new data sample x is a row vector of n elements, then the same transform can be applied to x to get the transformed vectors d and t:

d = x - \frac{1}{m} z X

t = d (D^{\mathrm{T}} D)^{-1/2}

where z is a 1 by m row vector with every element equal to 1.
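The following NumPy sketch illustrates the transform above on synthetic data (the helper name decorrelate and the test data are ours, not from the source); it computes the matrix square root of the inverse of D^T D via an eigendecomposition, which is one of several ways to obtain it:

```python
import numpy as np

def decorrelate(X):
    """Remove the sample correlation between the columns of X (hypothetical helper).

    X is an (m, n) array whose rows are samples. Returns (D, T, W): D is the
    zero-mean data, T = D @ W satisfies T.T @ T = I (the moment matrix is the
    identity), and W = (D^T D)^(-1/2) can be reused to transform new samples.
    """
    D = X - X.mean(axis=0)          # D = X - (1/m) Z X: subtract column means
    # Matrix square root of the inverse of D^T D, computed via the
    # eigendecomposition of the symmetric positive definite matrix D^T D.
    vals, vecs = np.linalg.eigh(D.T @ D)
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return D, D @ W, W

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 3))  # correlated synthetic data
D, T, W = decorrelate(X)
print(np.allclose(T.T @ T, np.eye(3)))                   # True: moment matrix of T is I

# Applying the same transform to a new sample x, as in the d and t equations:
x = rng.normal(size=(1, 3))
d = x - X.mean(axis=0)      # d = x - (1/m) z X
t = d @ W                   # t = d (D^T D)^(-1/2)
```

As noted above, the columns of T still have to be divided by their standard deviation if unit sample variance is wanted; the decorrelation itself is unaffected by that rescaling.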
This decorrelation is related to Principal Components Analysis for multivariate data.
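To make that connection concrete, the brief sketch below (an illustration on our own synthetic data, not a construction from the source) projects the centered data onto the eigenvectors of its sample covariance matrix, which is the core of PCA; the resulting principal component scores are mutually uncorrelated, though their variances equal the eigenvalues rather than all being equal as in T above.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))
D = X - X.mean(axis=0)

# PCA: the eigenvectors of the sample covariance give the projection
# directions; the projected scores are mutually uncorrelated.
cov = D.T @ D / (len(D) - 1)
vals, vecs = np.linalg.eigh(cov)
scores = D @ vecs
print(np.allclose(np.corrcoef(scores, rowvar=False), np.eye(3)))   # True
```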
