Deterministic Encryption

A deterministic encryption scheme (as opposed to a probabilistic encryption scheme) is a cryptosystem which always produces the same ciphertext for a given plaintext and key, even over separate executions of the encryption algorithm. Examples of deterministic encryption algorithms include the RSA cryptosystem (without encryption padding), and many block ciphers when used in ECB mode or with a constant initialization vector.

Deterministic encryption can leak information to an eavesdropper, who may recognize known ciphertexts. For example, when an adversary learns that a given ciphertext corresponds to some interesting message, she learns something every time that ciphertext is transmitted. To gain information about the meaning of various ciphertexts, an adversary might perform a statistical analysis of messages transmitted over an encrypted channel, or attempt to correlate ciphertexts with observed actions (e.g., noting that a given ciphertext is always received immediately before a submarine dives). This concern is particularly serious in the case of public key cryptography, where any party can encrypt chosen messages using a public encryption key. In this case, the adversary can build a large "dictionary" of useful plaintext/ciphertext pairs, then observe the encrypted channel for matching ciphertexts.

To counter this problem, cryptographers proposed the notion of "randomized" or probabilistic encryption. Under these schemes, a given plaintext can encrypt to one of a very large set of possible ciphertexts, chosen randomly during the encryption process. Under sufficiently strong security guarantees the attacks proposed above become infeasible, as the adversary will be unable to correlate any two encryptions of the same message, or correlate a message to its ciphertext, even given access to the public encryption key. This guarantee is known as semantic security or indistinguishability, and has several definitions depending on the assumed capabilities of the attacker (see semantic security).

While deterministic encryption schemes can never be semantically secure, they have some advantages over probabilistic schemes. One primary motivation for the use of deterministic encryption is the efficient searching of encrypted data. Suppose a client wants to outsource a database to a possibly untrusted database service provider. If each entry is encrypted using a public-key cryptosystem, anyone can add to the database, and only the distinguished "receiver" who has the secret key can decrypt the database entries. If, however, the receiver wants to search for a specific record in the database, this becomes very difficult. There are some Public Key encryption schemes that allow keyword search (e.g., ,), however these schemes all require search time linear in the database size. If the database entries were encrypted with a deterministic scheme and sorted, then a specific field of the database could be retrieved in logarithmic time. Assuming that a deterministic encryption scheme is going to be used, it is important to understand what is the maximum level of security that can be guaranteed. A number of recent works have focused on this exact problem. The first work to rigorously define security for a deterministic scheme was in CRYPTO 2007, . This work provided fairly strong security definitions (although weaker than semantic security), and gave constructions in the random oracle model. Two follow-up works appeared the next year in CRYPTO 2008, giving definitional equivalences and constructions without random oracles, .