Long-tail Traffic - The Heavy-tail Distribution

Heavy-tail distributions have properties that are qualitatively different from those of commonly used memoryless distributions such as the exponential distribution.

The Hurst parameter H measures the level of self-similarity of a time series that exhibits long-range dependence, to which the heavy-tail distribution can be applied. H takes values from 0.5 to 1. A value of 0.5 indicates that the data is uncorrelated or has only short-range correlations. The closer H is to 1, the greater the degree of persistence or long-range dependence.

Typical values of the Hurst parameter, H:

  • Any pure random process has H = 0.5
  • Phenomena with H > 0.5 typically have a complex process structure.
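
As a rough illustration of how H might be estimated from data, the sketch below uses the aggregated-variance method, which is not described in this article and is only one of several estimators. It assumes that the variance of block means of a self-similar series scales as m^(2H - 2) with block size m, so H follows from the slope of a log-log fit. The function name `hurst_aggregated_variance`, the block sizes, and the seed are illustrative choices.

```python
import math
import random


def hurst_aggregated_variance(series, block_sizes):
    """Estimate H from the slope of log(variance of block means) vs log(block size)."""
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue  # need at least two block means to compute a variance
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        centre = sum(means) / n_blocks
        variance = sum((v - centre) ** 2 for v in means) / n_blocks
        xs.append(math.log(m))
        ys.append(math.log(variance))
    # Least-squares slope beta of the log-log points; assumed model: Var ~ m^(2H - 2).
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    beta = num / den
    return 1.0 + beta / 2.0


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_estimate = hurst_aggregated_variance(white_noise, [1, 2, 4, 8, 16, 32, 64, 128])
print(f"estimated H for uncorrelated noise: {h_estimate:.2f}")
```

For uncorrelated noise the estimate should land near 0.5, matching the first bullet above. A series with long-range dependence would be expected to give a value closer to 1.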

A distribution is said to be heavy-tailed if:


P[X > x] \sim x^{-\alpha},\ \text{as}\ x \to \infty,\ 0 < \alpha < 2

This means that, regardless of how the distribution behaves for small values of the random variable, it is heavy-tailed if its asymptotic shape is hyperbolic. The simplest heavy-tail distribution is the Pareto distribution, which is hyperbolic over its entire range. The complementary distribution functions of the exponential and Pareto distributions are compared in two accompanying graphs. The left-hand graph plots the distributions on linear axes over a large domain. The right-hand graph plots the complementary distribution functions over a smaller domain, with a logarithmic range.

If the logarithm of the range of an exponential distribution is taken, the resulting plot is linear. In contrast, that of the heavy-tail distribution remains curvilinear. Both characteristics can be seen clearly in the right-hand graph. A characteristic of heavy-tail distributions is that, if the logarithm of both the range and the domain is taken, the tail is approximately linear over many orders of magnitude. In the left-hand graph, the condition for a heavy-tail distribution given earlier is not met by the curve labelled "Gamma-Exponential Tail".
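
The two linearity claims can be checked numerically. The minimal sketch below fits straight lines to the logarithm of each complementary distribution function. The parameter values (`rate`, `alpha`, `k`) and the helper names are illustrative assumptions, not values taken from the graphs.

```python
import math


def exponential_ccdf(x, rate):
    """P[X > x] for an exponential distribution with the given rate."""
    return math.exp(-rate * x)


def pareto_ccdf(x, alpha, k):
    """P[X > x] for a Pareto distribution with shape alpha and minimum k (x >= k)."""
    return (k / x) ** alpha


def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


rate, alpha, k = 0.01, 1.5, 1.0        # illustrative parameters
xs = [2.0 ** i for i in range(1, 11)]  # 2, 4, ..., 1024

# Semi-log axes (log of the range only): the exponential tail is a line of slope -rate.
exp_semilog = slope(xs, [math.log(exponential_ccdf(x, rate)) for x in xs])

# Log-log axes (log of range and domain): the Pareto tail is a line of slope -alpha.
log_xs = [math.log(x) for x in xs]
log_pareto = [math.log(pareto_ccdf(x, alpha, k)) for x in xs]
par_loglog = slope(log_xs, log_pareto)

# On semi-log axes the Pareto tail is curved: its local slope keeps flattening.
par_semilog_local = [
    (log_pareto[i + 1] - log_pareto[i]) / (xs[i + 1] - xs[i])
    for i in range(len(xs) - 1)
]

print(exp_semilog, par_loglog)
print(par_semilog_local[0], par_semilog_local[-1])
```

The fitted slopes recover `-rate` and `-alpha` respectively, while the local semi-log slope of the Pareto tail shrinks towards zero as x grows. That shrinking slope is the curvilinear behaviour described above.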

The probability density function of the Pareto distribution, the simplest heavy-tail distribution, is given by:


p(x)= \alpha k^{\alpha} x^{- \alpha -1},\ \alpha ,k>0,\ x \ge k

and its cumulative distribution function is given by:


F(x) = P[X \le x] = 1 - \left(\frac{k}{x}\right)^{\alpha}

where k represents the smallest value the random variable can take.
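
The two formulas above translate directly into code. The sketch below also draws samples by inverting the cumulative distribution function: solving u = F(x) for x gives x = k / (1 - u)^(1/alpha). Inverse-transform sampling is a standard technique that this article does not discuss, and the function names, parameter values, and seed are illustrative assumptions.

```python
import random


def pareto_pdf(x, alpha, k):
    """Density alpha * k^alpha * x^(-alpha - 1) for x >= k, and 0 below k."""
    if x < k:
        return 0.0
    return alpha * k ** alpha * x ** (-alpha - 1)


def pareto_cdf(x, alpha, k):
    """F(x) = 1 - (k / x)^alpha for x >= k, and 0 below k."""
    if x < k:
        return 0.0
    return 1.0 - (k / x) ** alpha


def pareto_sample(alpha, k, rng):
    """Draw one value by inverting F: x = k / (1 - u)^(1 / alpha)."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
    return k / (1.0 - u) ** (1.0 / alpha)


alpha, k = 1.5, 1.0     # illustrative shape and minimum value
rng = random.Random(1)  # fixed seed, chosen arbitrarily for repeatability
samples = [pareto_sample(alpha, k, rng) for _ in range(10000)]

# Compare the fraction of samples at or below 2k with the analytic CDF.
empirical = sum(1 for s in samples if s <= 2.0 * k) / len(samples)
print(empirical, pareto_cdf(2.0 * k, alpha, k))
```

Every sample is at least k, consistent with k being the smallest value the random variable can take, and the empirical fraction should sit close to F(2k).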

Readers interested in a more rigorous mathematical treatment of the subject are referred to the external links section.
