Universal Character Set

The Universal Character Set (UCS) is a standard set of characters upon which many character encodings are based. It is defined by the International Standard ISO/IEC 10646, Information technology — Universal multiple-octet coded character set (UCS), together with the amendments to that standard. The UCS contains well over one hundred thousand abstract characters, each identified by an unambiguous name and an integer called its code point.
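As a rough illustration of the name and code point pairing, the following Python sketch looks up both for a few characters. It relies on the standard library's unicodedata module, which follows the Unicode character database rather than ISO/IEC 10646 directly; the helper name describe is hypothetical and chosen only for this example.

```python
import unicodedata


def describe(ch):
    """Return (U+XXXX notation, character name) for a single character.

    'describe' is an illustrative helper, not part of any standard API.
    """
    code_point = ord(ch)                       # the integer code point
    name = unicodedata.name(ch, "<unnamed>")   # fallback if no name is recorded
    return "U+{:04X}".format(code_point), name


# Escapes are used so the example does not depend on the file's encoding.
for ch in ("A", "\u00e9", "\u20ac"):
    print(describe(ch))
```

The U+ prefix followed by at least four hexadecimal digits is the conventional way of writing a code point.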

Characters (letters, numbers, symbols, ideograms, logograms, etc.) from the many languages, scripts, and traditions of the world are represented in the UCS with unique code points. The inclusiveness of the UCS is continually improving as characters from previously unrepresented writing systems are added.

Since 1991, the Unicode Consortium has worked with ISO to develop The Unicode Standard ("Unicode") and ISO/IEC 10646 in tandem. The repertoire, character names, and code points of Version 2.0 of Unicode exactly match those of ISO/IEC 10646-1:1993 with its first seven published amendments. After the publication of Unicode 3.0 in February 2000, corresponding new and updated characters entered the UCS via ISO/IEC 10646-1:2000. In 2003, parts 1 and 2 of ISO/IEC 10646 were combined into a single part, which has since had a number of amendments adding characters to the standard in approximate synchrony with the Unicode standard.

The UCS has over 1.1 million code points available for use, but only the first 65,536 (the Basic Multilingual Plane, or BMP) had come into common use before 2000. This situation began changing when the People's Republic of China (PRC) ruled in 2000 that all software sold in its jurisdiction would have to support GB 18030. Because GB 18030 covers characters outside the BMP, software intended for sale in the PRC had to move beyond the BMP.
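The arithmetic behind these figures can be sketched in Python. The sketch assumes the codespace used by current editions, which runs from 0 to 0x10FFFF and is divided into planes of 65,536 code points each; the names PLANE_SIZE, plane_of and in_bmp are hypothetical and exist only for this example.

```python
PLANE_SIZE = 0x10000            # 65,536 code points per plane
MAX_CODE_POINT = 0x10FFFF       # assumed upper bound of the codespace
TOTAL_CODE_POINTS = MAX_CODE_POINT + 1
PLANE_COUNT = TOTAL_CODE_POINTS // PLANE_SIZE


def plane_of(code_point):
    """Return the plane number that contains the given code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("code point outside the assumed codespace")
    return code_point // PLANE_SIZE


def in_bmp(code_point):
    """True if the code point lies in plane 0, the BMP."""
    return plane_of(code_point) == 0


print(TOTAL_CODE_POINTS, PLANE_COUNT)
print(in_bmp(0x4E2D))    # a CJK ideograph inside the BMP
print(in_bmp(0x20000))   # first code point of plane 2, outside the BMP
```

Under these assumptions the total is 17 planes of 65,536 code points, or 1,114,112, which is the "over 1.1 million" figure quoted above.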

The UCS deliberately leaves many code points unassigned to characters, even in the BMP. It does this to allow for future expansion or to minimize conflicts with other encoding forms.
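One way to observe such gaps is to ask Python's unicodedata module for the general category of a code point. This is a sketch under two assumptions: the results depend on the Unicode database version bundled with the interpreter, and the category "Cn" also covers noncharacters, not only code points held back for future use. The helper name status is hypothetical.

```python
import unicodedata


def status(code_point):
    """Classify a code point by its Unicode general category.

    'status' is an illustrative helper; the labels are informal.
    """
    category = unicodedata.category(chr(code_point))
    if category == "Cn":
        return "unassigned or noncharacter"
    if category == "Cs":
        return "surrogate"        # range set aside for UTF-16
    if category == "Co":
        return "private use"
    return "assigned"


for cp in (0x0041, 0x0378, 0xD800, 0xE000):
    print("U+{:04X}".format(cp), status(cp))
```

The surrogate range is an example of code points kept free of characters so that an encoding form, here UTF-16, can use them without conflict.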
