Heavy-tailed distribution

In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded: that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.

There are two important subclasses of heavy-tailed distributions, the long-tailed distributions and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class.

Usage of the term heavy-tailed still varies, and two other definitions are in use. Some authors use the term for distributions that do not have all their power moments finite; others use it for distributions that do not have a finite variance. The definition given in this article is the most general in use: it includes all distributions encompassed by the alternative definitions, as well as distributions such as the log-normal that possess all their power moments yet are generally acknowledged to be heavy-tailed.

Definition of Heavy-tailed Distribution
The distribution of a random variable $$X$$ with distribution function $$F$$ is said to have a heavy right tail if

$$\lim_{x \to \infty} e^{\lambda x}\Pr[X>x] = \infty \quad \mbox{for all } \lambda>0.\, $$

This is also written in terms of the tail distribution function $$\overline{F}(x) \equiv \Pr(X>x)$$ as

$$\lim_{x \to \infty} e^{\lambda x}\overline{F}(x) = \infty \quad \mbox{for all } \lambda>0.\, $$

This is equivalent to the statement that the moment generating function of $$F$$, $$ M_F(t) $$, is infinite for all $$t>0$$.

The definitions of a heavy left tail, and of a distribution with both tails heavy, are analogous.
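The limit in the definition can be checked numerically on a simple case. The following is a minimal sketch, not a general test: it assumes a Pareto distribution with minimum value 1 and shape 2 as the heavy-tailed example and a rate-1 exponential as the light-tailed comparison. The function names and the choice $$\lambda = 0.5$$ are illustrative only.

```python
import math

def pareto_tail(x, alpha=2.0):
    """P(X > x) for a Pareto distribution with minimum value 1 and shape alpha."""
    return 1.0 if x < 1.0 else x ** -alpha

def exponential_tail(x, rate=1.0):
    """P(X > x) for an exponential distribution with the given rate."""
    return math.exp(-rate * x)

def scaled_tail(tail, x, lam):
    """The quantity e^(lam * x) * P(X > x) that appears in the definition."""
    return math.exp(lam * x) * tail(x)

lam = 0.5  # any lam below the exponential rate shows the bounded case
xs = [10, 50, 100, 200]
pareto_scaled = [scaled_tail(pareto_tail, x, lam) for x in xs]
exponential_scaled = [scaled_tail(exponential_tail, x, lam) for x in xs]

for x, p, e in zip(xs, pareto_scaled, exponential_scaled):
    print(f"x={x:>4}  Pareto: {p:.3e}  exponential: {e:.3e}")
```

The Pareto column grows without bound while the exponential column shrinks towards zero. For the exponential distribution the scaled tail does diverge once $$\lambda$$ exceeds the rate, but the definition requires divergence for every $$\lambda>0$$, which is why the exponential does not qualify.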

Definition of Long-tailed Distribution
The distribution of a random variable $$X$$ with distribution function $$F$$ is said to have a long right tail if for all $$ t \in \mathbb{R} $$

$$\lim_{x \to \infty} \Pr[X>x+t \mid X>x] = 1, $$

or equivalently

$$\overline{F}(x+t) \sim \overline{F}(x) \quad \mbox{as } x \to \infty. $$

This has an intuitive interpretation: if a quantity with a long right tail exceeds some high level, the probability approaches 1 that it will also exceed any other, higher level. In other words, if you know the situation is bad, it is probably worse than you think.

All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
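The long-tail condition can be illustrated by evaluating the ratio $$\overline{F}(x+t)/\overline{F}(x)$$ directly. This is a rough sketch under assumed examples: a Pareto distribution with minimum value 1 and shape 2, compared against a rate-1 exponential distribution. The helper names and the value $$t = 1$$ are hypothetical choices for illustration.

```python
import math

def pareto_tail(x, alpha=2.0):
    """P(X > x) for a Pareto distribution with minimum value 1 and shape alpha."""
    return 1.0 if x < 1.0 else x ** -alpha

def exponential_tail(x, rate=1.0):
    """P(X > x) for an exponential distribution with the given rate."""
    return math.exp(-rate * x)

def conditional_exceedance(tail, x, t):
    """P(X > x + t | X > x), computed as the ratio of tail probabilities."""
    return tail(x + t) / tail(x)

t = 1.0
xs = [10, 100, 500]
pareto_ratios = [conditional_exceedance(pareto_tail, x, t) for x in xs]
exponential_ratios = [conditional_exceedance(exponential_tail, x, t) for x in xs]

for x, p, e in zip(xs, pareto_ratios, exponential_ratios):
    print(f"x={x:>4}  Pareto: {p:.5f}  exponential: {e:.5f}")
```

The Pareto ratio climbs towards 1 as $$x$$ grows, as the definition requires. The exponential ratio stays fixed at $$e^{-t}$$ for every $$x$$ (the memoryless property), so the exponential distribution is not long-tailed.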

Subexponential Distributions
Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables $$ X_1,X_2$$ with common distribution function $$F$$, the convolution of $$F$$ with itself, $$F^{*2}$$, is defined by:

$$\Pr(X_1+X_2 \leq x) = F^{*2}(x) = \int_{- \infty}^{\infty} F(x-y)\,F(dy). $$

The n-fold convolution $$F^{*n}$$ is defined in the same way.
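As a concrete check of the convolution formula, the integral can be approximated numerically. The sketch below assumes that $$F$$ has a density $$f$$, so that $$F(dy) = f(y)\,dy$$, and uses the rate-1 exponential distribution because the sum of two such variables has the known distribution function $$1-(1+x)e^{-x}$$. The function names and the midpoint rule are illustrative choices, not part of the definition.

```python
import math

def exp_cdf(x):
    """Distribution function F of a rate-1 exponential."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def exp_pdf(y):
    """Density f of a rate-1 exponential."""
    return math.exp(-y) if y >= 0 else 0.0

def self_convolution_cdf(x, steps=10_000):
    """Midpoint-rule approximation of F^{*2}(x) = integral of F(x - y) f(y) dy.

    The integrand vanishes for y < 0 (zero density) and for y > x
    (F(x - y) = 0), so it is enough to integrate over [0, x].
    """
    if x <= 0:
        return 0.0
    h = x / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += exp_cdf(x - y) * exp_pdf(y)
    return total * h

x = 3.0
numeric = self_convolution_cdf(x)
closed_form = 1.0 - (1.0 + x) * math.exp(-x)
print(numeric, closed_form)
```

The two printed values should agree to several decimal places, which confirms that the integral reproduces $$\Pr(X_1+X_2 \leq x)$$ in this case.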

A distribution $$F$$ is subexponential if

$$\overline{F^{*2}}(x) \sim 2\overline{F}(x) \quad \mbox{as } x \to \infty. $$

This implies that, for any $$n \geq 1$$,

$$\overline{F^{*n}}(x) \sim n\overline{F}(x) \quad \mbox{as } x \to \infty. $$

The probabilistic interpretation of this is that, for a sum of $$n$$ independent random variables $$X_1,\ldots,X_n$$ with common distribution function $$F$$,

$$\Pr(X_1+ \cdots + X_n>x) \sim \Pr(\max(X_1, \ldots,X_n)>x) \quad \mbox{as } x \to \infty. $$

This is often known as the principle of the single big jump.

All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
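The principle of the single big jump can be seen in a small simulation. The following is a sketch under stated assumptions: it takes $$n = 2$$, draws Pareto variables with minimum value 1 and shape 1.5 using the standard library's `random.paretovariate`, and estimates the ratio $$\Pr(X_1+X_2>x)/\Pr(\max(X_1,X_2)>x)$$ by Monte Carlo. For contrast, the same ratio is computed exactly for two rate-1 exponential variables, whose sum has tail $$(1+x)e^{-x}$$. The function names, sample size, seed and thresholds are all illustrative.

```python
import math
import random

def tail_ratio_pareto(x, alpha=1.5, samples=200_000, seed=0):
    """Monte Carlo estimate of P(X1 + X2 > x) / P(max(X1, X2) > x) for Pareto draws."""
    rng = random.Random(seed)
    sum_hits = 0
    max_hits = 0
    for _ in range(samples):
        a = rng.paretovariate(alpha)
        b = rng.paretovariate(alpha)
        sum_hits += (a + b > x)
        max_hits += (max(a, b) > x)
    return sum_hits / max_hits

def tail_ratio_exponential(x):
    """Exact value of the same ratio for two independent rate-1 exponentials."""
    sum_tail = (1.0 + x) * math.exp(-x)         # tail of the sum (a gamma distribution)
    max_tail = 1.0 - (1.0 - math.exp(-x)) ** 2  # 1 - F(x)^2
    return sum_tail / max_tail

pareto_ratio = tail_ratio_pareto(100.0)
exponential_ratio = tail_ratio_exponential(10.0)
print(f"Pareto ratio at x=100:     {pareto_ratio:.3f}")
print(f"exponential ratio at x=10: {exponential_ratio:.3f}")
```

For the Pareto case the ratio is close to 1: when the sum is large, it is almost always because one of the two terms is large. For the exponential case the ratio is well above 1 and keeps growing with $$x$$, because a large sum typically arises from two moderately large terms.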

Common Heavy-tailed Distributions
All commonly used heavy-tailed distributions are subexponential.

Those that are one-tailed include:
 * the Pareto distribution;
 * the log-normal distribution;
 * the Weibull distribution with shape parameter less than 1;
 * the Burr distribution;
 * the log-gamma distribution.

Those that are two-tailed include:
 * the Cauchy distribution, itself a special case of
 * the t-distribution;
 * all of the stable distribution family, except for the special case of the normal distribution within that family. Stable distributions may be symmetric or not.
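Membership in these lists can depend on parameter values. The Weibull distribution, for example, is heavy-tailed only when its shape parameter is below 1. The sketch below illustrates this with the heavy-tail definition, assuming a Weibull distribution with scale 1, whose tail is $$\overline{F}(x) = e^{-x^{k}}$$ for shape $$k$$. It works with the logarithm of $$e^{\lambda x}\overline{F}(x)$$ to avoid floating-point overflow; the function name and the value $$\lambda = 0.1$$ are illustrative.

```python
def log_scaled_weibull_tail(x, shape, lam):
    """log of e^(lam * x) * P(X > x) for a Weibull with scale 1, i.e. lam * x - x ** shape."""
    return lam * x - x ** shape

lam = 0.1
xs = [100.0, 1_000.0, 10_000.0]
heavy = [log_scaled_weibull_tail(x, 0.5, lam) for x in xs]  # shape < 1
light = [log_scaled_weibull_tail(x, 2.0, lam) for x in xs]  # shape > 1

for x, h, l in zip(xs, heavy, light):
    print(f"x={x:>8.0f}  shape 0.5: {h:>12.2f}  shape 2: {l:>16.2f}")
```

With shape 0.5 the logarithm increases without bound, so the scaled tail diverges. With shape 2 it decreases without bound, so the scaled tail tends to zero. Shape 1 is the exponential distribution, the boundary case that is not heavy-tailed.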