Comparison of Unicode Encodings - Compatibility Issues

A UTF-8 file that contains only ASCII characters is byte-for-byte identical to an ASCII file. Legacy programs can generally handle UTF-8 encoded files, even if they contain non-ASCII characters. For instance, the C printf function can print a UTF-8 format string, because it only looks for the byte matching the ASCII '%' character and prints all other bytes unchanged. Every byte of a multi-byte UTF-8 sequence has its high bit set, so a non-ASCII character never contains a '%' byte and is copied unchanged to the output.
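The byte-level property behind this can be checked directly. The following is a minimal sketch in Python, not taken from the original article; the helper name `non_ascii_bytes_are_high` and the sample strings are illustrative assumptions.

```python
def non_ascii_bytes_are_high(text: str) -> bool:
    """Return True if every byte that encodes a non-ASCII character is >= 0x80."""
    for ch in text:
        if ord(ch) > 0x7F:
            # Every byte of a multi-byte UTF-8 sequence has its high bit set.
            if any(b < 0x80 for b in ch.encode("utf-8")):
                return False
    return True


# Hypothetical sample: accented letters, a check mark, and one real '%'.
sample = "na\u00efve caf\u00e9 100% \u2713"
encoded = sample.encode("utf-8")

# The only 0x25 ('%') bytes in the encoded form come from actual '%' characters.
percent_bytes = encoded.count(b"%")
percent_chars = sample.count("%")

# ASCII-only text encodes to the same bytes in ASCII and in UTF-8.
ascii_only = "plain ASCII text"
same_as_ascii = ascii_only.encode("utf-8") == ascii_only.encode("ascii")
```

A byte-oriented scanner that looks only for '%' therefore sees the same format directives whether or not the surrounding text contains non-ASCII characters.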

UTF-16 and UTF-32 are incompatible with ASCII files, and thus require Unicode-aware programs to display, print and manipulate them, even if the file is known to contain only characters in the ASCII subset. Because text in these encodings contains many zero bytes (each ASCII character is padded with one zero byte in UTF-16 and three in UTF-32), the strings cannot be manipulated by normal null-terminated string handling, even for simple operations such as copying.
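To illustrate the zero-byte problem, here is a small sketch in Python that imitates how a C-style routine would measure a string. The function `c_strlen` is a hypothetical stand-in for C's strlen, written only for this example, and little-endian byte order is assumed.

```python
def c_strlen(buf: bytes) -> int:
    """Mimic C strlen: count the bytes that precede the first zero byte."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


text = "Hi"
utf8 = text.encode("utf-8")       # 2 bytes, no zero bytes
utf16 = text.encode("utf-16-le")  # 4 bytes: 'H', 0x00, 'i', 0x00
utf32 = text.encode("utf-32-le")  # 8 bytes: each character padded with three zeros

utf8_len = c_strlen(utf8)
utf16_len = c_strlen(utf16)
utf32_len = c_strlen(utf32)
```

Under these assumptions the UTF-8 buffer is measured in full, while the UTF-16 and UTF-32 buffers appear to end after the first byte, so a null-terminated copy would truncate them.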

Therefore, even systems that use UTF-16 internally, such as Windows and Java, mostly store text files, such as program code, in 8-bit encodings (ASCII, ISO-8859-1, or UTF-8) rather than UTF-16. One of the few counterexamples is the "strings" file used by Mac OS X (10.3 and later) applications to look up internationalized versions of messages. These files default to UTF-16, and "files encoded using UTF-8 are not guaranteed to work. When in doubt, encode the file using UTF-16". This is because the default string class in Mac OS X (NSString) stores characters in UTF-16.

XML is, by default, encoded as UTF-8, and all XML processors must at least support UTF-8 (including US-ASCII by definition) and UTF-16.
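As a rough illustration, the sketch below feeds the same small document to Python's standard `xml.etree.ElementTree` parser in both encodings. The element name `msg` and its content are invented for the example, and the UTF-16 version relies on the byte order mark that Python's "utf-16" codec prepends.

```python
import xml.etree.ElementTree as ET

# Hypothetical document content with one non-ASCII character.
body = "<msg>caf\u00e9</msg>"

# No encoding declaration: an XML processor assumes UTF-8.
utf8_doc = body.encode("utf-8")

# UTF-16 with a byte order mark and a matching declaration.
utf16_doc = ('<?xml version="1.0" encoding="utf-16"?>' + body).encode("utf-16")

utf8_text = ET.fromstring(utf8_doc).text
utf16_text = ET.fromstring(utf16_doc).text
```

Both parses should yield the same element text, since the parser decodes each byte stream to the same sequence of Unicode characters.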
