In Bayesian statistics, a **hyperprior** is a prior distribution on a hyperparameter, that is, on a parameter of a prior distribution.

As with the term *hyperparameter*, the prefix *hyper* distinguishes a hyperprior from a prior distribution on a parameter of the model for the underlying system. Hyperpriors arise particularly in the use of conjugate priors.

For example, if one is using a beta distribution to model the distribution of the parameter *p* of a Bernoulli distribution, then:

- The Bernoulli distribution (with parameter *p*) is the *model* of the underlying system; *p* is a *parameter* of the underlying system (Bernoulli distribution);
- The beta distribution (with parameters *α* and *β*) is the *prior* distribution of *p*; *α* and *β* are parameters of the prior distribution (beta distribution), hence *hyperparameters*;
- A prior distribution of *α* and *β* is thus a *hyperprior*.
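
The three levels above can be sketched as a generative process, sampled from the top down. This is only an illustration: the choice of a Gamma(shape 2, scale 1) hyperprior on *α* and *β*, and the names `sample_hierarchy`, `n_obs`, and `rng`, are assumptions made for the example and are not part of the definition.

```python
import random


def sample_hierarchy(rng, n_obs=10):
    """Draw one sample from a hyperprior -> prior -> model hierarchy."""
    # Hyperprior (assumed for illustration): alpha, beta ~ Gamma(shape=2, scale=1).
    alpha = rng.gammavariate(2.0, 1.0)
    beta = rng.gammavariate(2.0, 1.0)

    # Prior: p ~ Beta(alpha, beta); alpha and beta are the hyperparameters.
    p = rng.betavariate(alpha, beta)

    # Model: each observation is Bernoulli(p); p is the model parameter.
    xs = [1 if rng.random() < p else 0 for _ in range(n_obs)]
    return alpha, beta, p, xs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
alpha, beta, p, xs = sample_hierarchy(rng, n_obs=5)
print(alpha, beta, p, xs)
```

Any distribution with support on the positive reals could stand in for the gamma hyperprior here; the point is only that *α* and *β* are themselves drawn from a distribution rather than fixed.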

In principle, one can iterate the above: if the hyperprior itself has hyperparameters, these may be called hyperhyperparameters, and so forth.

One can analogously call the posterior distribution on the hyperparameter the hyperposterior, and, if the hyperprior and hyperposterior are in the same family, call them conjugate hyperdistributions or call the hyperprior a conjugate hyperprior. However, this rapidly becomes very abstract and removed from the original problem.
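
To make the hyperposterior concrete, the following minimal sketch approximates it on a discrete grid for the beta-Bernoulli example. It assumes several groups, each with its own *p* drawn from a shared Beta(*α*, *β*) prior, and integrates *p* out analytically, so each group contributes a factor B(*α* + *k*, *β* + *n* − *k*) / B(*α*, *β*) for *k* successes in *n* trials. The grid values, the flat hyperprior, the data, and the function names are all assumptions chosen for illustration.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def hyperposterior_grid(groups, grid, log_hyperprior):
    """Normalized hyperposterior weights over a grid of (alpha, beta) pairs.

    groups: list of (successes, trials), one entry per group.
    grid: list of candidate (alpha, beta) pairs.
    log_hyperprior: function (alpha, beta) -> log hyperprior density.
    """
    log_w = []
    for a, b in grid:
        lw = log_hyperprior(a, b)
        for k, n in groups:
            # Marginal likelihood of the group with p integrated out;
            # the binomial coefficient is constant in (alpha, beta) and omitted.
            lw += log_beta(a + k, b + n - k) - log_beta(a, b)
        log_w.append(lw)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(x - m) for x in log_w]
    total = sum(w)
    return [x / total for x in w]


values = (0.5, 1.0, 2.0, 4.0)
grid = [(a, b) for a in values for b in values]
groups = [(9, 10), (8, 10), (10, 10)]  # hypothetical data: mostly successes
post = hyperposterior_grid(groups, grid, lambda a, b: 0.0)  # flat hyperprior
best = max(zip(post, grid))[1]
print(best)
```

A grid is used here only because it keeps the computation transparent; it does not show conjugacy, which would require a hyperprior family that is closed under this update.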
