ISO/IEC 2022 Character Sets

Character encodings that use the ISO/IEC 2022 mechanism include:

  • ISO-2022-JP. A widely used encoding for Japanese. Starts in ASCII and includes the following escape sequences
    • ESC ( B to switch to ASCII (1 byte per character)
    • ESC ( J to switch to JIS X 0201-1976 (ISO/IEC 646:JP) Roman set (1 byte per character)
    • ESC $ @ to switch to JIS X 0208-1978 (2 bytes per character)
    • ESC $ B to switch to JIS X 0208-1983 (2 bytes per character)
  • ISO-2022-JP-1. The same as ISO-2022-JP with one additional escape sequence
    • ESC $ ( D to switch to JIS X 0212-1990 (2 bytes per character)
  • ISO-2022-JP-2. A multilingual extension of ISO-2022-JP. The same as ISO-2022-JP-1 with the following additional escape sequences
    • ESC $ A to switch to GB 2312-1980 (2 bytes per character)
    • ESC $ ( C to switch to KS X 1001-1992 (2 bytes per character)
    • ESC . A to switch to ISO/IEC 8859-1 high part, Extended Latin 1 set (1 byte per character)
    • ESC . F to switch to ISO/IEC 8859-7 high part, Basic Greek set (1 byte per character)
  • ISO-2022-JP-3. The same as ISO-2022-JP with three additional escape sequences
    • ESC ( I to switch to JIS X 0201-1976 Kana set (1 byte per character)
    • ESC $ ( O to switch to JIS X 0213-2000 Plane 1 (2 bytes per character)
    • ESC $ ( P to switch to JIS X 0213-2000 Plane 2 (2 bytes per character)
  • ISO-2022-JP-2004. The same as ISO-2022-JP-3 with one additional escape sequence
    • ESC $ ( Q to switch to JIS X 0213-2004 Plane 1 (2 bytes per character)
  • ISO-2022-KR. An encoding for Korean.
    • ESC $ ) C to switch to KS X 1001-1992, previously named KS C 5601-1987 (2 bytes per character)
  • ISO-2022-CN. An encoding for Chinese.
    • ESC $ ) A to switch to GB 2312-1980 (2 bytes per character)
    • ESC $ ) G to switch to CNS 11643-1992 Plane 1 (2 bytes per character)
    • ESC $ * H to switch to CNS 11643-1992 Plane 2 (2 bytes per character)
  • ISO-2022-CN-EXT. The same as ISO-2022-CN with six additional escape sequences
    • ESC $ ) E to switch to ISO-IR-165 (2 bytes per character)
    • ESC $ + I to switch to CNS 11643-1992 Plane 3 (2 bytes per character)
    • ESC $ + J to switch to CNS 11643-1992 Plane 4 (2 bytes per character)
    • ESC $ + K to switch to CNS 11643-1992 Plane 5 (2 bytes per character)
    • ESC $ + L to switch to CNS 11643-1992 Plane 6 (2 bytes per character)
    • ESC $ + M to switch to CNS 11643-1992 Plane 7 (2 bytes per character)
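
As a rough illustration of the ISO-2022-JP entries in the list above, the following sketch reports which character sets a byte string switches to. The table holds only the four plain ISO-2022-JP sequences, and the names ISO_2022_JP_ESCAPES and list_designations are made up for this example. The sample input comes from Python's built-in iso2022_jp codec.

```python
# Hypothetical lookup table: only the four ISO-2022-JP sequences listed above.
# Value: (character set name, bytes per character).
ISO_2022_JP_ESCAPES = {
    b"\x1b(B": ("ASCII", 1),
    b"\x1b(J": ("JIS X 0201-1976 Roman", 1),
    b"\x1b$@": ("JIS X 0208-1978", 2),
    b"\x1b$B": ("JIS X 0208-1983", 2),
}


def list_designations(data):
    """Return the names of the character sets that data switches to, in order."""
    names = []
    width = 1  # ISO-2022-JP starts in ASCII: 1 byte per character
    i = 0
    while i < len(data):
        if data[i] == 0x1B:  # ESC introduces a 3-byte switching sequence
            seq = bytes(data[i:i + 3])
            if seq not in ISO_2022_JP_ESCAPES:
                raise ValueError("not an ISO-2022-JP escape sequence: %r" % seq)
            name, width = ISO_2022_JP_ESCAPES[seq]
            names.append(name)
            i += 3
        else:
            i += width  # skip one character of the current set
    return names


sample = "abc 日本語 xyz".encode("iso2022_jp")
print(list_designations(sample))
```

The sketch assumes well-formed input. Sequences from the extended variants (ISO-2022-JP-1 and later) are rejected rather than guessed at.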

The character after ESC (for single-byte character sets) or ESC $ (for multi-byte character sets) specifies both the type of character set and the working set to which it is designated. In the examples above, the character ( (0x28) designates a 94-character set to the G0 working set. It may be replaced by ), * or + (0x29–0x2B) to designate to the G1, G2 or G3 set instead.

Two of the codes above are 96-character sets. In those examples, the character . (0x2E) designates the set to the G2 working set. It may be replaced with - or / (0x2D or 0x2F) to designate to the G1 or G3 set. As mentioned earlier, a 96-character set may not be designated to the G0 set.
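
The mapping from designator byte to working set can be sketched as a small lookup table. This is only an illustration: DESIGNATORS and describe_designation are hypothetical names, and the function handles just the regular form with an explicit designator byte, so a sequence without one (such as ESC $ B) is reported as an error.

```python
# Designator byte -> (working set, characters in the set), as described above.
DESIGNATORS = {
    0x28: ("G0", 94), 0x29: ("G1", 94), 0x2A: ("G2", 94), 0x2B: ("G3", 94),
    0x2D: ("G1", 96), 0x2E: ("G2", 96), 0x2F: ("G3", 96),
}


def describe_designation(seq):
    """Decode ESC I F or ESC $ I F into (working set, size, multi_byte, final)."""
    if len(seq) < 3 or seq[0] != 0x1B:
        raise ValueError("not an escape sequence: %r" % seq)
    multi_byte = seq[1] == 0x24  # '$' marks a multi-byte character set
    rest = seq[2:] if multi_byte else seq[1:]
    if len(rest) != 2 or rest[0] not in DESIGNATORS:
        raise ValueError("no explicit designator byte: %r" % seq)
    working_set, size = DESIGNATORS[rest[0]]
    return working_set, size, multi_byte, chr(rest[1])


print(describe_designation(b"\x1b.A"))   # ISO/IEC 8859-1 high part
print(describe_designation(b"\x1b$)C"))  # KS X 1001-1992
```

Note that the table has no 96-character entry for G0, which mirrors the restriction stated above.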

There are three special cases for multi-byte codes. The code sequences ESC $ @, ESC $ A, and ESC $ B were all registered before the ISO/IEC 2022 standard was finalized, and so must be accepted as synonyms for the sequences ESC $ ( @, ESC $ ( A, and ESC $ ( B, which designate to the G0 character set. The longer form may also be used, and its ( character may be changed to designate to the G1 through G3 character sets.
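
A decoder might handle these three special cases by rewriting the short form into the regular one before further processing. The sketch below assumes that approach. LEGACY_MULTIBYTE, INTERMEDIATE_94, and expand_legacy are illustrative names, not part of any standard or library.

```python
# The three short multi-byte sequences that predate the finalized standard.
LEGACY_MULTIBYTE = {b"\x1b$@", b"\x1b$A", b"\x1b$B"}

# Designator byte for a 94-character set, per working set.
INTERMEDIATE_94 = {"G0": b"(", "G1": b")", "G2": b"*", "G3": b"+"}


def expand_legacy(seq, working_set="G0"):
    """Rewrite ESC $ F as ESC $ I F, where I selects the working set."""
    if seq not in LEGACY_MULTIBYTE:
        raise ValueError("not a legacy multi-byte designation: %r" % seq)
    return b"\x1b$" + INTERMEDIATE_94[working_set] + seq[2:3]


print(expand_legacy(b"\x1b$B"))        # regular G0 form
print(expand_legacy(b"\x1b$A", "G1"))  # same set, designated to G1 instead
```

With working_set="G1", the GB 2312-1980 sequence becomes ESC $ ) A, the form listed above for ISO-2022-CN.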

The standard also defines a way to specify coding systems that do not follow its own structure. Of particular interest, the sequence ESC % G designates the UTF-8 coding system, which does not reserve the range 0x80–0x9F for control characters.
