Shift JIS - Description

Shift JIS is based on the character sets defined in the JIS standards JIS X 0201:1997 (for the single-byte characters) and JIS X 0208:1997 (for the double-byte characters). The lead bytes for the double-byte characters are "shifted" around the 63 halfwidth katakana characters in the single-byte range 0xA1 to 0xDF. The single-byte characters 0x00 to 0x7F match the ASCII encoding, except for a yen sign (U+00A5) at 0x5C and an overline (U+203E) at 0x7E in place of ASCII's backslash and tilde, respectively. The single-byte characters from 0xA1 to 0xDF map to the halfwidth katakana characters of JIS X 0201. Shift JIS can be, and is, used for HTML because the characters that delimit HTML tags and attributes (<, >, /, ") only ever appear as themselves, never as part of a two-byte sequence.
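The single-byte mapping described above can be sketched in Python. This is a minimal illustration that follows the JIS X 0201 assignments literally; the function name `decode_single_byte` is hypothetical, and real-world decoders may treat 0x5C and 0x7E differently.

```python
def decode_single_byte(b: int) -> str:
    """Map one Shift JIS single-byte value to a Unicode character.

    Illustrative only: follows the JIS X 0201 assignments described in
    the text, not any particular library's behaviour.
    """
    if not 0x00 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b == 0x5C:
        return "\u00a5"  # yen sign in place of ASCII backslash
    if b == 0x7E:
        return "\u203e"  # overline in place of ASCII tilde
    if b < 0x80:
        return chr(b)    # otherwise identical to ASCII
    if 0xA1 <= b <= 0xDF:
        # Halfwidth katakana occupy U+FF61..U+FF9F in the same order.
        return chr(0xFF61 + (b - 0xA1))
    # Everything else is a double-byte lead byte or unassigned.
    raise ValueError(f"0x{b:02X} is not a single-byte character")


print(decode_single_byte(0x41))        # A
print(hex(ord(decode_single_byte(0xB1))))  # 0xff71
```

Bytes outside these ranges raise `ValueError` here because, in Shift JIS, they either start a double-byte character or are unused.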

Shift JIS requires an 8-bit clean medium for transmission. It is fully backwards compatible with the legacy JIS X 0201 single-byte encoding, meaning that it supports halfwidth katakana and that any valid JIS X 0201 string is also a valid Shift JIS string. For two-byte characters, however, Shift JIS only guarantees that the first byte has its high bit set (0x80–0xFF); the second byte can be either high or low. Because byte values 0x40–0x7E can occur as second bytes, and the same values also encode ASCII characters, reliable Shift JIS detection is difficult. By contrast, the competing 8-bit format EUC-JP, which does not support single-byte halfwidth katakana, allows a much cleaner and more direct conversion to and from JIS X 0208 code points: every byte with the high bit set is part of a multi-byte character, and every byte in the ASCII range represents a single-byte character. The same holds for UTF-8, which is an international standard, is better supported by software, and is widely expected to displace Shift JIS and EUC-JP.
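To make the ambiguity concrete, the following sketch encodes a single katakana character with Python's built-in `shift_jis`, `euc_jp` and `utf-8` codecs and checks whether any byte of the result falls in the ASCII range. The helper name `has_ascii_range_byte` is an assumption made for this example.

```python
# Katakana letter SO (U+30BD), JIS X 0208 code 0x253D.
ch = "\u30bd"

sjis = ch.encode("shift_jis")
eucjp = ch.encode("euc_jp")
utf8 = ch.encode("utf-8")


def has_ascii_range_byte(data: bytes) -> bool:
    """Return True if any byte of the sequence lies in 0x00..0x7F."""
    return any(b < 0x80 for b in data)


print(sjis.hex(), has_ascii_range_byte(sjis))    # 835c True
print(eucjp.hex(), has_ascii_range_byte(eucjp))  # a5bd False
print(utf8.hex(), has_ascii_range_byte(utf8))    # e382bd False

# A naive byte search for 0x5C "finds" one inside the Shift JIS character.
print(b"\x5c" in sjis)   # True
print(b"\x5c" in eucjp)  # False
```

The second byte of the Shift JIS form is 0x5C, the same value used for the single-byte yen sign (backslash in ASCII), so byte-oriented tools that are unaware of the encoding can misread it.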

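For a double-byte JIS sequence j1 j2 (each byte in the range 33 to 126), the transformation to the corresponding Shift JIS bytes s1 s2 is:

    s1 = floor((j1 + 1) / 2) + 112    if 33 <= j1 <= 94
    s1 = floor((j1 + 1) / 2) + 176    if 95 <= j1 <= 126

    s2 = j2 + 31 + floor(j2 / 96)     if j1 is odd
    s2 = j2 + 126                     if j1 is even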
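As a minimal sketch, this conversion can be written in Python using the commonly documented arithmetic for JIS X 0208 byte pairs in the range 0x21 to 0x7E. The function name `jis_to_sjis` is an illustrative assumption, and the sketch covers only plain JIS X 0208, not vendor extensions.

```python
def jis_to_sjis(j1: int, j2: int) -> bytes:
    """Convert a JIS X 0208 byte pair to its Shift JIS byte pair."""
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError("JIS X 0208 bytes must lie in 0x21..0x7E")
    # Two JIS rows share one lead byte. Rows from 0x5F (95) upward use a
    # larger offset so that lead bytes jump over 0xA1..0xDF, the
    # single-byte halfwidth katakana range.
    s1 = (j1 + 1) // 2 + (112 if j1 <= 94 else 176)
    if j1 % 2:
        # Odd row: trail bytes 0x40..0x9E, skipping 0x7F.
        s2 = j2 + 31 + j2 // 96
    else:
        # Even row: trail bytes 0x9F..0xFC.
        s2 = j2 + 126
    return bytes([s1, s2])


print(jis_to_sjis(0x24, 0x22).hex())  # 82a0 (hiragana A)
print(jis_to_sjis(0x25, 0x3D).hex())  # 835c (katakana SO)
print(jis_to_sjis(0x5F, 0x21).hex())  # e040
```

The last call shows the "shift": JIS row 0x5F lands on lead byte 0xE0, just past the end of the halfwidth katakana range.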
