Birthday Attack - Mathematics

Mathematics

Given a function, the goal of the attack is to find two different inputs such that . Such a pair is called a collision. The method used to find a collision is simply to evaluate the function for different input values that may be chosen randomly or pseudorandomly until the same result is found more than once. Because of the birthday problem, this method can be rather efficient. Specifically, if a function yields any of different outputs with equal probability and is sufficiently large, then we expect to obtain a pair of different arguments and with after evaluating the function for about different arguments on average.

We consider the following experiment. From a set of H values we choose n values uniformly at random thereby allowing repetitions. Let p(n; H) be the probability that during this experiment at least one value is chosen more than once. This probability can be approximated as

Let n(p; H) be the smallest number of values we have to choose, such that the probability for finding a collision is at least p. By inverting this expression above, we find the following approximation

and assigning a 0.5 probability of collision we arrive at

Let Q(H) be the expected number of values we have to choose before finding the first collision. This number can be approximated by

As an example, if a 64-bit hash is used, there are approximately 1.8 × 1019 different outputs. If these are all equally probable (the best case), then it would take 'only' approximately 5.1 × 109 attempts to generate a collision using brute force. This value is called birthday bound and for n-bit codes it could be computed as 2n/2. Other examples are as follows:

Bits Possible outputs
(rounded)(H)
Desired probability of random collision
(rounded) (p)
10−18 10−15 10−12 10−9 10−6 0.1% 1% 25% 50% 75%
16 6.6 × 104 2 2 2 2 2 11 36 1.9 × 102 3.0 × 102 4.3 × 102
32 4.3 × 109 2 2 2 2.9 93 2.9 × 103 9.3 × 103 5.0 × 104 7.7 × 104 1.1 × 105
64 1.8 × 1019 6.1 1.9 × 102 6.1 × 103 1.9 × 105 6.1 × 106 1.9 × 108 6.1 × 108 3.3 × 109 5.1 × 109 7.2 × 109
128 3.4 × 1038 2.6 × 1010 8.2 × 1011 2.6 × 1013 8.2 × 1014 2.6 × 1016 8.3 × 1017 2.6 × 1018 1.4 × 1019 2.2 × 1019 3.1 × 1019
256 1.2 × 1077 4.8 × 1029 1.5 × 1031 4.8 × 1032 1.5 × 1034 4.8 × 1035 1.5 × 1037 4.8 × 1037 2.6 × 1038 4.0 × 1038 5.7 × 1038
384 3.9 × 10115 8.9 × 1048 2.8 × 1050 8.9 × 1051 2.8 × 1053 8.9 × 1054 2.8 × 1056 8.9 × 1056 4.8 × 1057 7.4 × 1057 1.0 × 1058
512 1.3 × 10154 1.6 × 1068 5.2 × 1069 1.6 × 1071 5.2 × 1072 1.6 × 1074 5.2 × 1075 1.6 × 1076 8.8 × 1076 1.4 × 1077 1.9 × 1077
Table shows number of hashes n(p) needed to achieve the given probability of success, assuming all hashes are equally likely. For comparison, 10−18 to 10−15 is the uncorrectable bit error rate of a typical hard disk . In theory, MD5 hashes or UUIDs, being 128 bits, should stay within that range until about 820 billion documents, even if its possible outputs are many more.

It is easy to see that if the outputs of the function are distributed unevenly, then a collision can be found even faster. The notion of 'balance' of a hash function quantifies the resistance of the function to birthday attacks and allows the vulnerability of popular hashes such as MD and SHA to be estimated (Bellare and Kohno, 2004).

The subexpression in the equation for is not computed accurately for small when directly translated into common programming languages as log(1/(1-p)) due to loss of significance. When log1p is available (as it is in ANSI C), the equivalent expression -log1p(-p) should be used instead. When this is not done, the first column of the above table is computed as zero, and several items in the second column do not have even one correct significant digit.

Read more about this topic:  Birthday Attack

Famous quotes containing the word mathematics:

    Mathematics alone make us feel the limits of our intelligence. For we can always suppose in the case of an experiment that it is inexplicable because we don’t happen to have all the data. In mathematics we have all the data ... and yet we don’t understand. We always come back to the contemplation of our human wretchedness. What force is in relation to our will, the impenetrable opacity of mathematics is in relation to our intelligence.
    Simone Weil (1909–1943)

    ... though mathematics may teach a man how to build a bridge, it is what the Scotch Universities call the humanities, that teach him to be civil and sweet-tempered.
    Amelia E. Barr (1831–1919)

    Why does man freeze to death trying to reach the North Pole? Why does man drive himself to suffer the steam and heat of the Amazon? Why does he stagger his mind with the mathematics of the sky? Once the question mark has arisen in the human brain the answer must be found, if it takes a hundred years. A thousand years.
    Walter Reisch (1903–1963)