Jeffreys prior

In Bayesian probability, the Jeffreys prior (named after Harold Jeffreys) is a non-informative prior distribution for a parameter $$\theta$$, proportional to the square root of the Fisher information:


 * $$p(\theta) \propto \sqrt{I(\theta | y)}$$

and is invariant under reparameterization of $$\theta$$.
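
For example, for a single Bernoulli observation $$ y \in \{0, 1\} $$ with success probability $$ \theta $$, the Fisher information is $$ I(\theta | y) = \frac{1}{\theta(1-\theta)} $$, so the Jeffreys prior is


 * $$ p(\theta) \propto \theta^{-1/2}(1-\theta)^{-1/2} $$

which is the kernel of a Beta(1/2, 1/2) distribution.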

The Jeffreys prior is an important non-informative (objective) prior.

It allows us to describe our knowledge about $$ \phi $$, a transformation of $$ \theta $$, with an improper uniform distribution. This also implies that the resulting likelihood function, $$ L(\phi|X) $$, is asymptotically only translated by changes in the data (due to asymptotic normality, only its first moment varies when the data are updated), as illustrated in the sketch below.
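
As a minimal numerical sketch of this translation property (an added illustration, assuming a binomial model with a hypothetical sample size n = 100 and the transformation $$ \phi = 2\arcsin\sqrt{\theta} $$, which is derived for this model below), the curvature of the log-likelihood at its maximum in $$ \phi $$ is the same for different data sets; only its location changes:

```python
import numpy as np

n = 100  # assumed sample size for this illustration


def loglik_phi(phi, k, n):
    """Binomial log-likelihood in the parameterization theta = sin(phi / 2) ** 2."""
    theta = np.sin(phi / 2) ** 2
    return k * np.log(theta) + (n - k) * np.log(1 - theta)


for k in (20, 70):  # two different observed success counts
    phi_hat = 2 * np.arcsin(np.sqrt(k / n))  # maximum-likelihood estimate of phi
    h = 1e-4
    # Numerical second derivative of the log-likelihood at its maximum:
    # the observed information in phi is approximately n for both data sets,
    # so changing the data shifts the likelihood without reshaping it.
    curvature = (loglik_phi(phi_hat + h, k, n)
                 - 2 * loglik_phi(phi_hat, k, n)
                 + loglik_phi(phi_hat - h, k, n)) / h ** 2
    print(f"k = {k}: phi_hat = {phi_hat:.3f}, observed information = {-curvature:.1f}")
```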

It can be derived as follows:

We look for an injective transformation $$ \phi $$ of $$ \theta $$ such that the prior under this transformation is uniform, and therefore gives us "no information". We then use the following relation:


 * $$ I(\phi | y) = \left(\frac{d\theta}{d\phi}\right)^2 I(\theta | y)$$

Requiring $$ I(\phi | y) $$ to be constant, so that the uniform prior on $$ \phi $$ indeed carries no information, we conclude


 * $$ \frac{d\phi}{d\theta} \propto \sqrt{I(\theta | y)}$$


 * $$ \phi \propto \int \sqrt{I(\theta | y)} \, d\theta $$
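
Continuing the Bernoulli example, $$ I(\theta | y) = \frac{1}{\theta(1-\theta)} $$, so


 * $$ \phi \propto \int \frac{d\theta}{\sqrt{\theta(1-\theta)}} = 2\arcsin\sqrt{\theta} $$

and the Jeffreys prior is uniform in $$ \phi $$ on $$ [0, \pi] $$, which transforms back to the Beta(1/2, 1/2) density $$ \frac{1}{\pi\sqrt{\theta(1-\theta)}} $$ on $$ \theta $$.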

From a practical and mathematical standpoint, a valid reason to use this non-informative prior instead of others, such as those obtained as limits of conjugate families of distributions, is that it best represents the lack of knowledge once a particular parametric family has been chosen, and it is connected with strong results in Bayesian statistics.