Unicode - Origin and Development

Origin and Development

Unicode has the explicit aim of transcending the limitations of traditional character encodings, such as those defined by the ISO 8859 standard, which are widely used in various countries but remain largely incompatible with one another. Many traditional character encodings share a common problem: they allow bilingual computer processing (usually Latin characters plus the local script), but not multilingual computer processing (arbitrary scripts mixed with each other).
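The incompatibility can be illustrated with a short Python sketch. It assumes Python's built-in `iso-8859-1` and `iso-8859-5` codecs as stand-ins for two parts of ISO 8859; the helper `can_encode` is a hypothetical name introduced only for this illustration.

```python
import unicodedata

# One byte, two meanings: the same value decodes to a different character
# depending on which part of ISO 8859 the reader assumes.
raw = bytes([0xE4])
as_latin1 = raw.decode("iso-8859-1")    # Western European part
as_cyrillic = raw.decode("iso-8859-5")  # Latin/Cyrillic part

print(unicodedata.name(as_latin1))    # LATIN SMALL LETTER A WITH DIAERESIS
print(unicodedata.name(as_cyrillic))  # CYRILLIC SMALL LETTER EF

# A string that mixes both scripts (the multilingual case).
mixed = as_latin1 + as_cyrillic


def can_encode(text: str, encoding: str) -> bool:
    """Hypothetical helper: True if every character of text fits the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Neither single-script encoding can hold the mixed string; a Unicode
# encoding form such as UTF-8 can.
for enc in ("iso-8859-1", "iso-8859-5", "utf-8"):
    print(enc, can_encode(mixed, enc))
```

Each ISO 8859 part handles its own script alongside basic Latin, but the mixed string fits only the Unicode-based encoding.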

Unicode, in intent, encodes the underlying characters—graphemes and grapheme-like units—rather than the variant glyphs (renderings) for such characters. In the case of Chinese characters, this sometimes leads to controversies over distinguishing the underlying character from its variant glyphs (see Han unification).

In text processing, Unicode takes the role of providing a unique code point—a number, not a glyph—for each character. In other words, Unicode represents a character in an abstract way and leaves the visual rendering (size, shape, font, or style) to other software, such as a web browser or word processor. This simple aim becomes complicated, however, because of concessions made by Unicode's designers in the hope of encouraging a more rapid adoption of Unicode.
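As a minimal sketch of the "number, not a glyph" idea, the following Python snippet uses the standard `ord`, `chr`, and `unicodedata.name` functions to show the code point and formal name behind a few characters. The three sample characters are arbitrary choices for illustration.

```python
import unicodedata

# Sample characters, written as escapes so the source stays unambiguous:
# "A", "e with acute", and a CJK ideograph.
samples = ["A", "\u00e9", "\u4e2d"]

for ch in samples:
    code_point = ord(ch)           # the abstract number assigned to the character
    name = unicodedata.name(ch)    # its formal Unicode name
    print(f"U+{code_point:04X} {name}")
    # The mapping is reversible: the number alone identifies the character.
    assert chr(code_point) == ch
```

Nothing in this output says how the character should look; size, shape, and font are decided by whatever software eventually renders it.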

The first 256 code points were made identical to the content of ISO 8859-1 so as to make it trivial to convert existing Western European text. Many essentially identical characters were encoded multiple times at different code points to preserve distinctions used by legacy encodings and thus allow conversion from those encodings to Unicode (and back) without losing any information. For example, the "fullwidth forms" section of code points encompasses a full Latin alphabet that is separate from the main Latin alphabet section. In Chinese, Japanese, and Korean (CJK) fonts, these characters are rendered at the same width as CJK ideographs, rather than at half the width. For other examples, see Duplicate characters in Unicode.
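Both concessions can be checked with a short Python sketch. It assumes Python's `iso-8859-1` codec and the standard `unicodedata` module. The NFKC compatibility normalization used at the end is not mentioned in the text above; it is brought in here only as one way to show that Unicode itself treats the fullwidth letter as a variant of the ordinary one.

```python
import unicodedata

# Concession 1: every ISO 8859-1 byte value decodes to the code point
# with the same number, so conversion is a one-to-one widening.
latin1_bytes = bytes(range(256))
decoded = latin1_bytes.decode("iso-8859-1")
latin1_is_identity = [ord(ch) for ch in decoded] == list(range(256))
print(latin1_is_identity)  # True

# Concession 2: the fullwidth capital A is a separate code point from
# the ordinary capital A, even though it is essentially the same letter.
fullwidth_a = "\uff21"
ordinary_a = "A"
print(unicodedata.name(fullwidth_a))  # FULLWIDTH LATIN CAPITAL LETTER A
print(fullwidth_a == ordinary_a)      # False

# The East Asian Width property records the display distinction:
# "F" (fullwidth) versus "Na" (narrow).
widths = (
    unicodedata.east_asian_width(fullwidth_a),
    unicodedata.east_asian_width(ordinary_a),
)
print(widths)  # ('F', 'Na')

# Assumption for illustration: NFKC normalization folds the
# compatibility variant back to the ordinary letter.
folded = unicodedata.normalize("NFKC", fullwidth_a)
print(folded == ordinary_a)  # True
```

Keeping the two letters distinct is what lets text from a legacy CJK encoding be converted to Unicode and back unchanged, at the cost that a plain string comparison no longer treats them as equal.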
