Venti - Hash Collisions

The pigeonhole principle states that if set A contains more elements than set B, then any function that maps A to B must send at least two members of A to the same member of B. In the case of Venti, the set of possible SHA-1 hashes (2^160 values) is far smaller than the set of all possible blocks that could be stored in the filesystem, and thus hash collisions necessarily exist: two different blocks can share the same hash.
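The pigeonhole argument can be made concrete with a deliberately tiny hash. The following is a minimal sketch, not part of Venti: it truncates SHA-1 to a single byte (256 possible values) and hashes 257 distinct inputs, so a collision is guaranteed. The helper names `truncated_sha1` and `find_collision` are hypothetical and chosen only for illustration.

```python
import hashlib


def truncated_sha1(data: bytes, nbytes: int = 1) -> bytes:
    """Return the first nbytes of the SHA-1 digest (a deliberately tiny hash)."""
    return hashlib.sha1(data).digest()[:nbytes]


def find_collision(nbytes: int = 1):
    """Hash distinct inputs until two of them share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    limit = 256 ** nbytes + 1  # one more input than there are possible outputs
    for i in range(limit):
        block = str(i).encode()  # distinct input for every i
        h = truncated_sha1(block, nbytes)
        if h in seen:
            return seen[h], block, h
        seen[h] = block
    # With more inputs than outputs, the loop always returns before this point.
    raise AssertionError("unreachable: the pigeonhole principle guarantees a collision")


a, b, h = find_collision()
print(a, b, h.hex())
```

With the full 160-bit digest the same guarantee holds in principle, but the number of inputs needed to force a collision (2^160 + 1) is far beyond anything that can be enumerated.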

The risk of an accidental collision in a 160-bit hash is very small, even for exabytes of data: by the birthday bound, the probability that any two of n distinct blocks collide is at most about n^2/2^161. Historically, however, many hash functions have become increasingly vulnerable to deliberately constructed collisions as cryptanalysis and computing power advance. Venti does not address the issue of hash collisions. When Venti was designed, finding collisions in SHA-1 was computationally infeasible; practical SHA-1 collisions have since been demonstrated (the SHAttered attack of 2017), so it may become necessary for Venti to switch to a different hash function.
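To put a rough number on the accidental-collision risk described above, the sketch below evaluates the birthday bound n(n-1)/2^(b+1) for a b-bit hash. It assumes SHA-1 outputs behave like uniformly random 160-bit values. The figures used in the example (one exabyte taken as 2^60 bytes, stored in 8 KiB blocks) are illustrative assumptions, and `collision_probability` is a hypothetical helper name.

```python
def collision_probability(num_blocks: int, hash_bits: int = 160) -> float:
    """Birthday (union) bound on the chance that any two blocks share a hash.

    Computes n(n-1) / 2^(b+1), which is an upper bound on the collision
    probability and a close approximation while the result is small.
    """
    return num_blocks * (num_blocks - 1) / 2 ** (hash_bits + 1)


# Illustrative assumptions: 1 EiB of data stored as 8 KiB blocks.
EXABYTE = 2 ** 60
BLOCK_SIZE = 8 * 1024
num_blocks = EXABYTE // BLOCK_SIZE  # 2^47 blocks

p = collision_probability(num_blocks)
print(f"{num_blocks} blocks -> collision probability about {p:.2e}")
```

Under these assumptions the bound works out to roughly 2^-67, or on the order of 10^-20. This estimate covers accidental collisions only; it says nothing about an attacker who constructs colliding blocks on purpose.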
