Prior distributions may be estimated within the model via hyperprior distributions, which are usually vague and nearly flat. Parameters of hyperprior distributions are called hyperparameters. Using hyperprior distributions to estimate prior distributions is known as hierarchical Bayes. In theory, this process could continue further, using hyperhyperprior distributions to estimate the hyperprior distributions. Estimating priors through hyperpriors, and from the data, is a method to elicit the optimal prior distributions. One of many natural uses for hierarchical Bayes is multilevel modeling.
Recall that the unnormalized joint posterior distribution is proportional to the likelihood times the prior distribution
The simplest hierarchical Bayes model takes the form
where Φ is a set of hyperprior distributions. By reading the equation from right to left, it begins with hyperpriors Φ, which are used conditionally to estimate priors p(Θ|Φ), which in turn is used, as per usual, to estimate the likelihood p(y|Θ), and finally the posterior is p(Θ,Φ|y).