IDN Homograph Attack

Prehistory

An early nuisance of this kind, pre-dating the Internet and even text terminals, was the confusion between "l" (the lowercase letter "L") and "1" (the number "one"), and between "O" (the capital letter "o") and "0" (the number "zero"). Some typewriters in the pre-computer era even conflated the ell and the one; users had to type a lowercase L when the number one was needed. The zero/oh confusion gave rise to the tradition of crossing zeros, so that a computer operator would type them correctly. Unicode greatly widens the scope for such confusion with its combining characters, accents, several types of hyphen look-alikes, etc. The problem is often made worse by inadequate rendering support, especially at small font sizes and across a wide variety of fonts.

Even earlier, handwriting provided rich opportunities for confusion. A notable example is the etymology of the word "zenith": when the Arabic "samt" was transliterated, a scribe misread the "m" as "ni". Such errors were common in medieval blackletter, which did not connect the vertical strokes of the letters i, m, n, or u, making them difficult to distinguish when several appeared in a row. This kind of confusion, as well as the "rn"/"m"/"rri" ("RN"/"M"/"RRI") confusion, can still fool the human eye even with modern computer technology.

Intentional substitution of look-alike characters from different alphabets has also been known in various contexts. For example, Faux Cyrillic has been used as an amusement or attention-grabber, and "Volapuk encoding" was used in the early days of the Internet as a way to overcome the lack of support for the Cyrillic alphabet.
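
Substituting look-alike characters across alphabets can be illustrated with a short Python sketch. The strings below are hypothetical examples, and guessing a character's script from the first word of its Unicode name is only a rough heuristic assumed here for illustration, not a method described in this article.

```python
import unicodedata

# Hypothetical example strings: "mixed" replaces the Latin "a" (U+0061)
# with the visually similar Cyrillic "а" (U+0430).
latin = "example"
mixed = "ex\u0430mple"

def scripts_used(text):
    """Guess the scripts in text from the first word of each character's
    Unicode name (a rough heuristic; it works for letters such as these)."""
    return {unicodedata.name(ch).split()[0] for ch in text}

print(latin == mixed)        # the two strings differ despite looking alike
print(scripts_used(latin))   # only Latin letters
print(scripts_used(mixed))   # Latin and Cyrillic letters mixed together
```

The two strings are likely to render identically in many fonts, yet they compare as unequal because they contain different code points.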
