Bias of an Estimator: Bayesian View

Most Bayesians are rather unconcerned about unbiasedness (at least in the formal sampling-theory sense above) of their estimates. For example, Gelman et al. (1995) write: "From a Bayesian perspective, the principle of unbiasedness is reasonable in the limit of large samples, but otherwise it is potentially misleading."

Fundamentally, the difference between the Bayesian approach and the sampling-theory approach above is that in the sampling-theory approach the parameter is taken as fixed, and probability distributions of a statistic are then considered, based on the predicted sampling distribution of the data. For a Bayesian, however, it is the data which is known and fixed, and it is the unknown parameter for which an attempt is made to construct a probability distribution, using Bayes' theorem:

p(θ | D) ∝ p(θ) · p(D | θ)

Here the second term, the likelihood of the data given the unknown parameter value θ, depends just on the data obtained and on the modelling of the data-generation process. However, a Bayesian calculation also includes the first term, the prior probability for θ, which takes account of everything the analyst may know or suspect about θ before the data comes in. This information plays no part in the sampling-theory approach; indeed, any attempt to include it would be considered "bias" away from what was indicated purely by the data. To the extent that Bayesian calculations include prior information, it is therefore essentially inevitable that their results will not be "unbiased" in sampling-theory terms.
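As a minimal numerical sketch of this point, the snippet below computes a posterior on a grid as prior × likelihood for the mean of a Normal with known standard deviation; the data, the prior, and all numerical values are illustrative assumptions chosen to show the prior pulling the estimate away from the sample mean.

```python
import numpy as np

# Grid illustration of the Bayes update: posterior ∝ prior × likelihood.
# Data, known sd, and prior are illustrative assumptions.

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=5)    # small sample, sd known to be 1

theta = np.linspace(-5.0, 10.0, 2001)            # grid over the unknown mean θ

# Log-likelihood of the data at each grid value of θ (Normal, sd = 1).
log_lik = np.array([-0.5 * np.sum((data - t) ** 2) for t in theta])

# An informative prior centred at 0 with sd 0.5, deliberately away
# from the truth, to show how prior information pulls the estimate.
log_prior = -0.5 * (theta / 0.5) ** 2

log_post = log_prior + log_lik
weights = np.exp(log_post - log_post.max())
weights /= weights.sum()                         # normalise on the grid

print("sample mean (unbiased in sampling terms):", data.mean())
print("posterior mean (pulled toward the prior):", np.sum(theta * weights))
```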

But the results of a Bayesian approach can differ from those of the sampling-theory approach even if the Bayesian tries to adopt an "uninformative" prior.

For example, consider again the estimation of an unknown population variance σ² of a Normal distribution with unknown mean, using an estimator of the form cnS², where S² = (1/n) Σ(Xᵢ − X̄)² is the uncorrected sample variance, and where it is desired to optimise c in the expected loss function

ExpectedLoss = E[(cnS² − σ²)²] = E[σ⁴ (cnS²/σ² − 1)²]
A standard choice of uninformative prior for this problem is the Jeffreys prior, p(σ²) ∝ 1/σ², which is equivalent to adopting a rescaling-invariant flat prior for ln σ².
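This equivalence is easy to check numerically. The sketch below truncates the improper prior to an arbitrary finite range [a, b] so that it becomes a proper (log-uniform) distribution that can be sampled; the range is an assumption made purely for the demonstration.

```python
import numpy as np

# Check that p(σ²) ∝ 1/σ² is flat in ln σ². On a finite range [a, b]
# this prior is exactly the log-uniform distribution, which can be
# sampled as σ² = a·(b/a)^U with U ~ Uniform(0, 1).

rng = np.random.default_rng(1)
a, b = 0.01, 100.0
u = rng.uniform(size=1_000_000)
sigma2 = a * (b / a) ** u                 # draws with density ∝ 1/σ² on [a, b]

# ln σ² should then be uniform on [ln a, ln b]: every bin roughly equal.
counts, _ = np.histogram(np.log(sigma2), bins=20,
                         range=(np.log(a), np.log(b)))
print(counts)                             # ≈ 50,000 per bin
```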

One consequence of adopting this prior is that S²/σ² remains a pivotal quantity: the probability distribution of S²/σ² depends only on the ratio itself, not on the individual values of S² or σ², whether one conditions on S² or on σ²:

p(S²/σ² | S²) = p(S²/σ² | σ²) = g(S²/σ²)
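The pivotal property can be verified by simulation. The sketch below assumes S² is the uncorrected sample variance as defined above (so that nS²/σ² ~ χ²(n − 1)), with illustrative values for n and for the generating σ².

```python
import numpy as np

# Monte Carlo check of the pivotal property: the distribution of
# S²/σ² is the same whatever σ² generated the data. Here S² is the
# uncorrected sample variance (1/n)Σ(xᵢ − x̄)²; n and the σ² values
# are illustrative.

rng = np.random.default_rng(2)
n, reps = 10, 200_000

for sigma2 in (0.5, 4.0, 25.0):
    x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    ratio = x.var(axis=1) / sigma2        # var(ddof=0) is the uncorrected S²
    print(f"σ² = {sigma2:5.1f}: mean = {ratio.mean():.3f}, sd = {ratio.std():.3f}")

# Theory: E[S²/σ²] = (n−1)/n = 0.9 and sd = √(2(n−1))/n ≈ 0.424 for every σ².
```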
However, whilst

E_{S²|σ²}[σ⁴ (cnS²/σ² − 1)²] = σ⁴ E_{S²|σ²}[(cnS²/σ² − 1)²],

in contrast

E_{σ²|S²}[σ⁴ (cnS²/σ² − 1)²] ≠ σ⁴ E_{σ²|S²}[(cnS²/σ² − 1)²]
When the expectation is taken over the probability distribution of σ² given S², as it is in the Bayesian case, rather than of S² given σ², one can no longer take σ⁴ as a constant and factor it out. The consequence is that, compared to the sampling-theory calculation, the Bayesian calculation puts more weight on larger values of σ², properly taking into account (as the sampling-theory calculation cannot) that under this squared-loss function underestimating large values of σ² is more costly than overestimating small values of σ².
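This failure of the factorisation can be seen numerically: σ⁴ and (cnS²/σ² − 1)² are correlated under the distribution of σ² given S². The sketch below draws σ² from the posterior via σ² = nS²/χ²(n − 1), consistent with the scaled inverse chi-squared posterior stated next; n, S², and c are illustrative values.

```python
import numpy as np

# Show that σ⁴ cannot be factored out of the posterior expectation:
# E[σ⁴·f(σ²)] ≠ E[σ⁴]·E[f(σ²)], where f(σ²) = (cnS²/σ² − 1)²,
# because σ⁴ and f(σ²) are correlated under σ² | S².
# n, S² and c are illustrative assumptions.

rng = np.random.default_rng(3)
n, S2 = 10, 1.0
c = 1.0 / (n + 1)                         # the sampling-theory choice of c
sigma2 = n * S2 / rng.chisquare(n - 1, size=2_000_000)   # posterior draws

f = (c * n * S2 / sigma2 - 1.0) ** 2
print("E[σ⁴·f(σ²)]    =", np.mean(sigma2 ** 2 * f))
print("E[σ⁴]·E[f(σ²)] =", np.mean(sigma2 ** 2) * np.mean(f))   # differs
```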

The worked-out Bayesian calculation gives a scaled inverse chi-squared distribution with n − 1 degrees of freedom for the posterior probability distribution of σ². The expected loss is minimised when cnS² = E[σ² | S²], the posterior mean of σ²; since that posterior mean is nS²/(n − 3), this occurs when c = 1/(n − 3).
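A short Monte Carlo sketch, with illustrative values of n and S² and posterior draws generated as σ² = nS²/χ²(n − 1) as above, confirms this minimiser.

```python
import numpy as np

# Verify that c = 1/(n − 3) minimises the posterior expected loss
# E[(cnS² − σ²)² | S²]. As a quadratic in c, the loss is minimised
# where cnS² equals the posterior mean E[σ² | S²] = nS²/(n − 3).
# n and S² are illustrative values.

rng = np.random.default_rng(4)
n, S2 = 10, 1.0
sigma2 = n * S2 / rng.chisquare(n - 1, size=4_000_000)   # posterior draws

cs = np.linspace(0.05, 0.35, 301)
losses = [np.mean((c * n * S2 - sigma2) ** 2) for c in cs]
print("numerical argmin over c:", cs[int(np.argmin(losses))])
print("theoretical 1/(n − 3)  :", 1.0 / (n - 3))         # ≈ 0.1429
# For comparison, the sampling-theory calculation gives c = 1/(n + 1) ≈ 0.0909.
```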

Even with an uninformative prior, therefore, a Bayesian calculation may not give the same expected-loss-minimising result as the corresponding sampling-theory calculation.
