Estimation of Covariance Matrices - Estimation in A General Context

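Given a sample consisting of $n$ independent observations $x_1, \ldots, x_n$ of a $p$-dimensional random vector $X \in \mathbb{R}^{p \times 1}$ (a $p \times 1$ column vector), an unbiased estimator of the $(p \times p)$ covariance matrix

$$\Sigma = \operatorname{E}\left[(X - \operatorname{E}[X])(X - \operatorname{E}[X])^{\mathrm{T}}\right]$$

is the sample covariance matrix

$$\mathbf{Q} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\mathrm{T}},$$

where $x_i$ is the $i$-th observation of the $p$-dimensional random vector, and

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$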

is the sample mean. This is true regardless of the distribution of the random vector X, provided that the theoretical means and covariances exist. The factor n − 1 rather than n appears for essentially the same reason as in the unbiased estimates of scalar variances and covariances: the true mean is not known and is replaced by the sample mean.
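
As a minimal sketch of the estimator described above, the following pure-Python code computes the sample mean and the unbiased sample covariance matrix. The function names and the four two-dimensional observations are hypothetical, chosen only for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sample_mean(data: List[Vector]) -> Vector:
    """Component-wise mean of n observations, each a length-p list."""
    n = len(data)
    p = len(data[0])
    return [sum(row[j] for row in data) / n for j in range(p)]


def sample_covariance(data: List[Vector]) -> Matrix:
    """Unbiased p-by-p sample covariance: scatter matrix divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    p = len(data[0])
    mean = sample_mean(data)
    # Subtract the sample mean from every observation.
    centered = [[row[j] - mean[j] for j in range(p)] for row in data]
    # Entry (j, k) is the sum of products of centered components j and k.
    return [
        [sum(c[j] * c[k] for c in centered) / (n - 1) for k in range(p)]
        for j in range(p)
    ]


# Hypothetical data: n = 4 observations of a p = 2 dimensional vector.
observations = [[2.0, 1.0], [4.0, 3.0], [6.0, 2.0], [8.0, 6.0]]
x_bar = sample_mean(observations)
Q = sample_covariance(observations)
print(x_bar)
print(Q)
```

For this data the sample mean is (5, 3) and the sums of centered products are 20, 14 and 14, so dividing by n − 1 = 3 gives entries 20/3, 14/3 and 14/3.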

In cases where the distribution of the random vector X is known to be within a certain family of distributions, other estimates may be derived on the basis of that assumption. A well-known instance is when X is normally distributed: in this case the maximum likelihood estimator of the covariance matrix is slightly different from the unbiased estimate, and is given by

$$\mathbf{Q}_n = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\mathrm{T}}.$$

A derivation of this result is given below. Clearly, the difference between the unbiased estimator and the maximum likelihood estimator diminishes for large n.
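
Because the two estimators divide the same sum of centered outer products by n − 1 and by n respectively, the maximum likelihood estimate equals (n − 1)/n times the unbiased estimate, and the gap between them is the unbiased estimate divided by n. The sketch below illustrates this on simulated data; the helper names, the seed, and the choice of two independent standard normal components are assumptions made for the example.

```python
import random


def scatter_matrix(data):
    """Sum over observations of (x_i - mean)(x_i - mean)^T, as nested lists."""
    n, p = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - mean[j] for j in range(p)] for row in data]
    return [[sum(c[j] * c[k] for c in centered) for k in range(p)] for j in range(p)]


def unbiased_cov(data):
    """Scatter matrix divided by n - 1."""
    n = len(data)
    return [[s / (n - 1) for s in row] for row in scatter_matrix(data)]


def mle_cov(data):
    """Scatter matrix divided by n (maximum likelihood under normality)."""
    n = len(data)
    return [[s / n for s in row] for row in scatter_matrix(data)]


rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = {}
for n in (10, 100, 1000):
    # Hypothetical data: two independent standard normal components.
    data = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n)]
    q, q_n = unbiased_cov(data), mle_cov(data)
    # Largest entry-wise difference between the two estimates.
    gaps[n] = max(abs(q[j][k] - q_n[j][k]) for j in range(2) for k in range(2))
    print(n, gaps[n])
```

The printed gap shrinks roughly in proportion to 1/n, which is why the choice between the two estimators matters mainly for small samples.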

In the general case, the unbiased estimate of the covariance matrix provides an acceptable estimate when the data vectors in the observed data set are all complete, that is, they contain no missing elements. When some elements are missing, one approach is to estimate each variance or pairwise covariance separately, using all the observations for which both variables have valid values. Assuming the missing data are missing at random, this results in an estimate for the covariance matrix which is unbiased. However, for many applications this estimate may not be acceptable because the estimated covariance matrix is not guaranteed to be positive semi-definite. This could lead to estimated correlations with absolute values greater than one, and/or a non-invertible covariance matrix.
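
The failure mode described above can be reproduced with a small constructed example. The sketch below is one possible reading of the pairwise approach, with `None` marking a missing element; the function name and the nine observations are hypothetical and were chosen so that the three pairwise covariances contradict each other.

```python
from math import sqrt


def pairwise_cov(data):
    """Pairwise-deletion covariance; None marks a missing element."""
    p = len(data[0])
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(p):
            # Keep only rows where both variables j and k are observed.
            pairs = [(r[j], r[k]) for r in data
                     if r[j] is not None and r[k] is not None]
            m = len(pairs)
            if m < 2:
                raise ValueError(f"too few complete pairs for ({j}, {k})")
            mean_j = sum(a for a, _ in pairs) / m
            mean_k = sum(b for _, b in pairs) / m
            cov[j][k] = sum((a - mean_j) * (b - mean_k) for a, b in pairs) / (m - 1)
    return cov


# Hypothetical data for variables (x, y, z); each block lacks one variable.
data = [
    [1.0, 1.0, None], [2.0, 2.0, None], [3.0, 3.0, None],  # x and y move together
    [None, 1.0, 1.0], [None, 2.0, 2.0], [None, 3.0, 3.0],  # y and z move together
    [1.0, None, 3.0], [2.0, None, 2.0], [3.0, None, 1.0],  # x and z move oppositely
]
cov = pairwise_cov(data)

# Implied correlation between x and y.
corr_xy = cov[0][1] / sqrt(cov[0][0] * cov[1][1])

# Quadratic form v^T C v; a negative value shows C is not positive semi-definite.
v = [1.0, -1.0, 1.0]
quad = sum(v[j] * cov[j][k] * v[k] for j in range(3) for k in range(3))

print(cov)
print(corr_xy, quad)
```

Here each variance is estimated from six values and comes out as 0.8, while each covariance is estimated from only three pairs and has magnitude 1. The implied correlation between x and y is therefore about 1.25, and the quadratic form evaluates to about −3.6, so the estimated matrix is not positive semi-definite.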
