Character Encoding Translation

Because many character encodings are in use, and archived data must remain readable for backward compatibility, many computer programs have been developed to translate data between encoding schemes. Some of these are listed below.

Cross-platform:

  • Web browsers – most modern web browsers feature automatic character encoding detection. In Firefox 3, for example, see the View/Character Encoding submenu.
  • iconv – program and standardized API to convert encodings.
  • convert_encoding.py – Python-based utility to convert text files between arbitrary encodings and line endings.
  • decodeh.py – algorithm and module to heuristically guess the encoding of a string.
  • International Components for Unicode (ICU) – a set of C and Java libraries to perform charset conversion. The uconv command-line tool is part of ICU4C.
  • chardet – a port of the Mozilla automatic encoding detection code to the Python programming language.
  • file – newer versions of the Unix file command attempt a basic detection of character encoding (also available on Cygwin and Mac).
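
The two tasks these tools perform, converting between encodings and guessing an unknown encoding, can be illustrated with a minimal Python sketch that uses only the standard library codecs. The function names transcode and guess_encoding and the candidate list are assumptions made for this illustration; the guessing strategy is a simple trial decode and is not the algorithm used by decodeh.py, chardet, or file.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(source).encode(target)


def guess_encoding(data: bytes, candidates=("ascii", "utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes data without error.

    Illustrative heuristic only: stricter encodings are tried first, and
    latin-1 accepts any byte sequence, so it acts as a last resort.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Example: convert Latin-1 bytes to UTF-8.
latin1_bytes = "café".encode("latin-1")
utf8_bytes = transcode(latin1_bytes, "latin-1", "utf-8")
```

A successful trial decode only shows that the bytes are valid in that encoding, not that the encoding is the one the author used, which is why dedicated detectors also rely on statistical models of the text.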

Linux:

  • cmv – simple tool for transcoding filenames.
  • convmv – convert a filename from one encoding to another.
  • cstocs – convert file contents from one encoding to another.
  • enca – analyzes encodings for given text files.
  • recode – convert file contents from one encoding to another.
  • utrac – convert file contents from one encoding to another.
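
Filename transcoding of the kind performed by convmv and cmv can be sketched in Python as follows. This is an illustration under stated assumptions rather than a description of how either tool works: the helper names transcode_filename and plan_renames are hypothetical, and the sketch only computes the proposed new names without touching the file system.

```python
def transcode_filename(name: bytes, source: str, target: str = "utf-8") -> bytes:
    """Reinterpret a raw filename from the source encoding in the target encoding."""
    return name.decode(source).encode(target)


def plan_renames(names, source: str, target: str = "utf-8"):
    """Return (old, new) pairs for names whose bytes would change.

    This is a dry run: nothing is renamed. Pure-ASCII names are
    unaffected by most conversions and are therefore skipped.
    """
    plan = []
    for old in names:
        new = transcode_filename(old, source, target)
        if new != old:
            plan.append((old, new))
    return plan
```

On POSIX systems, os.listdir called with a bytes path returns filenames as raw bytes, which could be passed to plan_renames, and each resulting pair could then be applied with os.rename after review.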

Windows:

  • Encoding.Convert – .NET API
  • MultiByteToWideChar/WideCharToMultiByte – Win32 API functions that convert from ANSI (code page) strings to Unicode (UTF-16) and from Unicode to ANSI.
  • cscvt – character set conversion tool
  • enca – analyzes encodings for given text files.
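
The ANSI-to-Unicode round trip can be approximated in Python to show what such a conversion involves. The sketch below does not call the Win32 or .NET APIs. The function names ansi_to_wide and wide_to_ansi are hypothetical, code page 1252 is assumed as the ANSI code page, and UTF-16 little-endian is assumed as the wide-character form.

```python
def ansi_to_wide(data: bytes, code_page: str = "cp1252") -> bytes:
    """Convert code-page bytes to UTF-16LE bytes (assumed wide-character form)."""
    return data.decode(code_page).encode("utf-16-le")


def wide_to_ansi(data: bytes, code_page: str = "cp1252") -> bytes:
    """Convert UTF-16LE bytes to code-page bytes.

    Characters that the code page cannot represent are replaced with "?",
    so this direction can lose information.
    """
    return data.decode("utf-16-le").encode(code_page, errors="replace")


# The euro sign is byte 0x80 in code page 1252 and U+20AC in Unicode.
wide = ansi_to_wide(b"\x80")
```

The conversion to Unicode is lossless for every byte the code page defines, whereas the reverse direction is lossy whenever the text contains characters outside the code page.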
