Long-tail Traffic - The Heavy-tail Distribution

The Heavy-tail Distribution

Heavy-tail distributions have properties that are qualitatively different from commonly used (memoryless) distributions such as the exponential distribution.

The Hurst parameter H measures the level of self-similarity of a time series that exhibits long-range dependence, to which the heavy-tail distribution can be applied. H takes on values from 0.5 to 1. A value of 0.5 indicates that the data is uncorrelated or has only short-range correlations. The closer H is to 1, the greater the degree of persistence or long-range dependence.

Typical values of the Hurst parameter, H:

Any pure random process has H = 0.5
Phenomena with H > 0.5 typically have a complex process structure.
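
As a rough illustration of how H can be read off a time series, the sketch below uses the aggregated-variance method, which is one common estimator and is not taken from this article. It relies on the scaling relation that the variance of block means over blocks of length m behaves like m^(2H - 2). The function names, block sizes, seed, and series length are all assumptions made for the example.

```python
import math
import random
import statistics


def block_means(series, m):
    """Average the series over consecutive non-overlapping blocks of length m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def hurst_aggregated_variance(series, block_sizes=(4, 8, 16, 32, 64, 128)):
    """Estimate H from the least-squares slope of log Var(block means) vs log m."""
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(statistics.pvariance(block_means(series, m)))
          for m in block_sizes]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    # Var(block mean) ~ m^(2H - 2), so slope = 2H - 2 and H = 1 + slope / 2.
    return 1.0 + slope / 2.0


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
white_noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
h_estimate = hurst_aggregated_variance(white_noise)
print(f"estimated H for uncorrelated noise: {h_estimate:.2f}")
```

For uncorrelated noise such as this, the estimate should land near 0.5. A series with long-range dependence would be expected to give a value closer to 1, although estimates from short series are noisy.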

A distribution is said to be heavy-tailed if:

$P[X > x] \sim x^{- \alpha},\ \text{as} \ x \to \infty,\ 0 < \alpha < 2$
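
A short derivation indicates why the range of α matters. It is a sketch under the assumptions that X is non-negative and that its tail behaves like $c\,x^{-\alpha}$ beyond some cut-off $x_0$, where the constant c > 0 and $x_0$ are symbols introduced only for this illustration. The n-th moment of a non-negative random variable can be written as:

$E[X^n] = \int_0^\infty n x^{n-1} P[X > x]\,dx$

and the contribution of the tail is finite only when the exponent is small enough:

$\int_{x_0}^\infty n x^{n-1} c\,x^{-\alpha}\,dx = c\,n \int_{x_0}^\infty x^{n-1-\alpha}\,dx < \infty \iff n < \alpha$

Under these assumptions the second moment (n = 2) diverges whenever α < 2, so the variance is infinite, and the mean (n = 1) diverges as well when α ≤ 1.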

This means that, regardless of the distribution for small values of the random variable, a distribution is heavy-tailed if its asymptotic shape is hyperbolic. The simplest heavy-tail distribution is the Pareto distribution, which is hyperbolic over its entire range. Complementary distribution functions for the exponential and Pareto distributions are shown below. The graph on the left plots them on linear axes over a large domain. The graph to its right plots them over a smaller domain, with a logarithmic range.

If the logarithm of the range of an exponential distribution is taken, the resulting plot is linear. In contrast, that of the heavy-tail distribution is still curvilinear. These characteristics can be clearly seen on the graph above to the right. A characteristic of long-tail distributions is that if the logarithm of both the range and the domain is taken, the tail of the distribution is approximately linear over many orders of magnitude. In the graph above left, the curve labelled "Gamma-Exponential Tail" does not meet the condition for a heavy-tail distribution presented earlier.
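
The behaviour under logarithmic axes can be checked numerically. The sketch below evaluates the complementary distribution functions of an exponential and a Pareto distribution and computes the slopes between successive points. The parameter values (rate 1, α = 1.5, k = 1) and the function names are assumptions chosen for illustration, not values taken from the graphs.

```python
import math


def exponential_ccdf(x, rate=1.0):
    """P[X > x] for an exponential distribution with the given rate."""
    return math.exp(-rate * x)


def pareto_ccdf(x, alpha=1.5, k=1.0):
    """P[X > x] for a Pareto distribution with shape alpha and minimum k."""
    return (k / x) ** alpha if x >= k else 1.0


def slopes(xs, ys):
    """Slopes of the straight segments joining successive (x, y) points."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            for i in range(len(xs) - 1)]


xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
log_xs = [math.log(x) for x in xs]

# Logarithm of the range only (semi-log axes).
semi_log_exponential = slopes(xs, [math.log(exponential_ccdf(x)) for x in xs])
semi_log_pareto = slopes(xs, [math.log(pareto_ccdf(x)) for x in xs])

# Logarithm of both the range and the domain (log-log axes).
log_log_pareto = slopes(log_xs, [math.log(pareto_ccdf(x)) for x in xs])

print("exponential, semi-log:", [round(s, 3) for s in semi_log_exponential])
print("Pareto, semi-log:     ", [round(s, 3) for s in semi_log_pareto])
print("Pareto, log-log:      ", [round(s, 3) for s in log_log_pareto])
```

With these parameters the exponential slopes on semi-log axes are all equal to the negative rate, the Pareto slopes on the same axes keep changing, and the Pareto slopes on log-log axes are all equal to -α.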

The probability density function of the Pareto distribution, the simplest heavy-tail distribution, is given by:

$p(x)= \alpha k^{\alpha} x^{- \alpha -1},\ \alpha ,k>0,\ x \ge k$

and its cumulative distribution function is given by:

$F(x)=P[X \le x]=1- \left(\frac{k}{x}\right)^{\alpha}$

where k represents the smallest value the random variable can take.
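
The density and cumulative distribution function given above (those of the Pareto distribution) translate directly into code. The sketch below also draws samples by inverting F, that is, by solving u = F(x) for x to get x = k (1 - u)^(-1/α). The function names, the seed, and the values α = 1.5 and k = 1 are assumptions made for the example.

```python
import random


def pareto_pdf(x, alpha, k):
    """Density alpha * k^alpha * x^(-alpha - 1) for x >= k, zero below k."""
    return alpha * k ** alpha * x ** (-alpha - 1) if x >= k else 0.0


def pareto_cdf(x, alpha, k):
    """F(x) = 1 - (k / x)^alpha for x >= k, zero below k."""
    return 1.0 - (k / x) ** alpha if x >= k else 0.0


def pareto_sample(rng, alpha, k):
    """Inverse-transform sampling: solve u = F(x) for x with u uniform on [0, 1)."""
    u = rng.random()
    return k * (1.0 - u) ** (-1.0 / alpha)


alpha, k = 1.5, 1.0
rng = random.Random(42)  # fixed seed, chosen arbitrarily for repeatability
samples = [pareto_sample(rng, alpha, k) for _ in range(20000)]

# Fraction of samples at or below 4, to compare with F(4).
empirical = sum(1 for s in samples if s <= 4.0) / len(samples)
print(f"F(4) = {pareto_cdf(4.0, alpha, k):.3f}, empirical = {empirical:.3f}")
print(f"smallest sample = {min(samples):.4f}, largest sample = {max(samples):.1f}")
```

With these parameters F(4) = 1 - (1/4)^1.5 = 0.875, and the empirical fraction should be close to that value. No sample falls below k, while the largest samples can be several orders of magnitude above it, which is the heavy tail showing up in the data.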

Readers interested in a more rigorous mathematical treatment of the subject are referred to the external links section.
