C0 and C1 Control Codes - Encoding Interoperability

Encoding Interoperability

While the C1 control characters are used in conjunction with the ISO/IEC 8859 series of graphical character sets, among others, and are incorporated into Unicode, they are rarely used directly, except on specific platforms such as OpenVMS. When bytes in the C1 range (0x80 to 0x9F) turn up in documents, Web pages, e-mail messages, etc. that are ostensibly in an ISO-8859-n encoding, they generally refer instead to the characters at those positions in a proprietary, system-specific encoding such as Windows-1252 or the Apple Macintosh ("MacRoman") character set. These encodings use the single-byte code positions reserved for the C1 set to provide additional graphic characters instead, although this is technically invalid under the ISO encodings. In Unicode, each C1 character requires two bytes when encoded in UTF-8 (for instance, CSI at U+009B is encoded as the bytes 0xC2, 0x9B). The corresponding control functions are therefore more commonly accessed using the equivalent two-byte escape sequences intended for use with systems that have only 7-bit bytes.
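The behaviour described above can be illustrated with a short Python sketch. It relies only on the standard codecs; the sample byte string and the helper name c1_to_7bit are hypothetical and chosen purely for illustration. The helper assumes the usual convention that a C1 control at code point N corresponds to ESC followed by the byte N - 0x40.

```python
# The same two bytes mean different things under different 8-bit encodings.
raw = b"\x93quoted\x94"  # illustrative sample, not taken from any real document

as_latin1 = raw.decode("iso-8859-1")  # C1 controls U+0093 and U+0094
as_cp1252 = raw.decode("cp1252")      # curly quotes U+201C and U+201D

# A C1 control such as CSI (U+009B) takes two bytes in UTF-8.
csi_utf8 = "\u009b".encode("utf-8")   # bytes 0xC2, 0x9B


def c1_to_7bit(code_point: int) -> bytes:
    """Return the 7-bit escape sequence for a C1 control (hypothetical helper).

    Assumes the convention ESC (0x1B) followed by (code_point - 0x40).
    """
    if not 0x80 <= code_point <= 0x9F:
        raise ValueError("not a C1 control code point")
    return bytes([0x1B, code_point - 0x40])


csi_7bit = c1_to_7bit(0x9B)           # ESC [ (bytes 0x1B, 0x5B)

print([hex(ord(ch)) for ch in as_latin1[:1] + as_cp1252[:1]])
print(csi_utf8, csi_7bit)
```

Both the UTF-8 form and the escape-sequence form of CSI occupy two bytes, but only the escape sequence survives a channel or terminal that handles 7-bit data.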
