Long-tail Traffic - The Heavy-tail Distribution

Heavy-tail distributions have properties that are qualitatively different from those of commonly used (memoryless) distributions such as the exponential distribution.

The Hurst parameter H measures the degree of self-similarity of a time series that exhibits long-range dependence, the kind of series to which the heavy-tail distribution can be applied. H takes values from 0.5 to 1. A value of 0.5 indicates that the data are uncorrelated or have only short-range correlations. The closer H is to 1, the greater the degree of persistence or long-range dependence.

Typical values of the Hurst parameter, H:

  • Any pure random process has H = 0.5
  • Phenomena with H > 0.5 typically have a complex process structure.
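As a rough illustration of how H can be estimated from data, the sketch below uses the aggregated-variance method, which is one of several possible estimators and is not prescribed by the text above. The function name `hurst_aggregated_variance`, the block sizes, and the synthetic white-noise input are all assumptions chosen for the example. For an uncorrelated series the estimate should land near 0.5.

```python
import math
import random
import statistics


def hurst_aggregated_variance(series, block_sizes):
    """Estimate H via the aggregated-variance method (hypothetical helper).

    For a self-similar series, the variance of block means of size m
    scales roughly as m**(2H - 2), so H = 1 + slope / 2 on log-log axes.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        log_m.append(math.log(m))
        log_var.append(math.log(statistics.pvariance(means)))
    # Ordinary least-squares slope of log(variance) against log(block size).
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
             / sum((x - mean_x) ** 2 for x in log_m))
    return 1.0 + slope / 2.0


rng = random.Random(0)  # fixed seed so the run is repeatable
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_estimate = hurst_aggregated_variance(white_noise, [1, 2, 4, 8, 16, 32, 64])
print(f"Estimated H for uncorrelated noise: {h_estimate:.2f}")
```

Applied to a series with long-range dependence, the same procedure would be expected to return a value closer to 1, although the estimate is noisy for short series.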

A distribution is said to be heavy-tailed if:


P[X > x] \sim x^{- \alpha},\ \text{as} \ x \to \infty,\ 0< \alpha <2
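One consequence of restricting the exponent to 0 < α < 2 can be worked out directly from this tail condition. The derivation below is a sketch under stated assumptions: the random variable X is non-negative, its tail probability is written P[X > x], and the tail behaves like c x^{-α} beyond some point x_0, where c and x_0 are hypothetical constants introduced only for this example.

```latex
% Assumption: X \ge 0 and P[X > x] \approx c\,x^{-\alpha} for x \ge x_0 > 0
E[X] = \int_0^{\infty} P[X > x]\,dx, \qquad
\int_{x_0}^{\infty} c\,x^{-\alpha}\,dx < \infty \iff \alpha > 1

E[X^2] = \int_0^{\infty} 2x\,P[X > x]\,dx, \qquad
\int_{x_0}^{\infty} 2c\,x^{1-\alpha}\,dx < \infty \iff \alpha > 2
```

Under these assumptions the second moment, and hence the variance, is infinite for every α in the stated range, and the mean is infinite as well when α ≤ 1.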

This means that, regardless of how the distribution behaves for small values of the random variable, it is heavy-tailed if its asymptotic shape is hyperbolic. The simplest heavy-tail distribution is the Pareto distribution, which is hyperbolic over its entire range. Complementary distribution functions for the exponential and Pareto distributions are shown below. On the left is a graph of the distributions on linear axes, spanning a large domain. To its right is a graph of the complementary distribution functions over a smaller domain, with a logarithmic range.

If the logarithm of the range of an exponential distribution is taken, the resulting plot is linear. In contrast, that of the heavy-tail distribution is still curvilinear. These characteristics can be clearly seen in the graph above to the right. A characteristic of heavy-tail distributions is that if the logarithm of both the range and the domain is taken, the tail of the distribution is approximately linear over many orders of magnitude. In the graph above left, the condition for the existence of a heavy-tail distribution, as previously presented, is not met by the curve labelled "Gamma-Exponential Tail".
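The contrast between the two kinds of axes can be checked numerically. The sketch below evaluates the complementary distribution functions of an exponential and a Pareto distribution at a few points and compares the slopes between consecutive points on semi-log and log-log axes. The parameter values (rate 1, α = 1.5, k = 1), the sample points, and the helper names are assumptions made for the example.

```python
import math


def exponential_ccdf(x, rate=1.0):
    """P[X > x] for an exponential distribution (rate is an assumed value)."""
    return math.exp(-rate * x)


def pareto_ccdf(x, alpha=1.5, k=1.0):
    """P[X > x] for a Pareto distribution (alpha and k are assumed values)."""
    return (k / x) ** alpha if x >= k else 1.0


def slopes(points):
    """Slopes between consecutive (x, y) points."""
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]


xs = [1.0, 10.0, 100.0]

# Semi-log axes: logarithm of the range only.
exp_semilog = slopes([(x, math.log(exponential_ccdf(x))) for x in xs])
pareto_semilog = slopes([(x, math.log(pareto_ccdf(x))) for x in xs])

# Log-log axes: logarithm of both the domain and the range.
exp_loglog = slopes([(math.log(x), math.log(exponential_ccdf(x))) for x in xs])
pareto_loglog = slopes([(math.log(x), math.log(pareto_ccdf(x))) for x in xs])

print("semi-log slopes:", exp_semilog, pareto_semilog)
print("log-log slopes: ", exp_loglog, pareto_loglog)
```

A constant slope indicates a straight line: the exponential slopes are constant (equal to minus the rate) only on semi-log axes, while the Pareto slopes are constant (equal to minus α) only on log-log axes.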

The probability density function of the Pareto distribution is given by:


p(x)= \alpha k^{\alpha} x^{- \alpha -1},\ \alpha ,k>0,\ x \ge k

and its cumulative distribution function is given by:


F(x)=P[X \le x]=1- \left(\frac{k}{x}\right)^{\alpha}

where k represents the smallest value the random variable can take.
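Because the cumulative distribution function has a closed form, it can be inverted to generate samples: setting u = F(x) and solving for x gives x = k / (1 - u)^(1/α). The sketch below applies this inverse-transform approach and compares an empirical tail probability with the value the formula predicts. The helper name `sample_pareto`, the parameter values, the seed, and the threshold are assumptions made for the example.

```python
import random


def sample_pareto(alpha, k, rng):
    """Draw one Pareto sample by inverting F(x) = 1 - (k / x)**alpha."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
    return k / (1.0 - u) ** (1.0 / alpha)


alpha, k = 1.5, 1.0  # assumed values for illustration
rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [sample_pareto(alpha, k, rng) for _ in range(100_000)]

threshold = 2.0 * k
empirical_tail = sum(x > threshold for x in samples) / len(samples)
theoretical_tail = (k / threshold) ** alpha  # 1 - F(threshold)
print(f"P[X > {threshold}]: empirical {empirical_tail:.3f}, "
      f"theoretical {theoretical_tail:.3f}")
```

Every sample is at least k, consistent with k being the smallest value the random variable can take. Python's standard library also offers `random.paretovariate(alpha)`, which corresponds to the case k = 1.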

Readers interested in a more rigorous mathematical treatment of the subject are referred to the external links section.
