Prefix Code - Techniques

Techniques

If every word in the code has the same length, the code is called a fixed-length code, or a block code (though the term block code is also used for fixed-size error-correcting codes in channel coding). For example, ISO 8859-15 letters are always 8 bits long. UTF-32/UCS-4 letters are always 32 bits long. ATM packets are always 424 bits long. A block code of fixed length k bits can encode up to source symbols.

Prefixes cannot exist in a fixed-length code without padding fixed codes to the shorter prefixes in order to meet the length of the longest prefixes (however such padding codes may be selected to introduce redundancy that allows autocorrection and/or synchronisation). However, fixed length encodings are inefficient in situations where some words are much more likely to be transmitted than others (in which case some or all of the redundancy may be eliminated for data compression).

Truncated binary encoding is a straightforward generalization of block codes to deal with cases where the number of symbols n is not a power of two. Source symbols are assigned codewords of length k and k+1. where .

Huffman coding is a more sophisticated technique for constructing variable-length prefix codes. The Huffman coding algorithm takes as input the frequencies that the code words should have, and constructs a prefix code that minimizes the weighted average of the code word lengths. This is a form of lossless data compression based on entropy encoding.

Some codes mark the end of a code word with a special "comma" symbol, different from normal data. This is somewhat analogous to the spaces between words in a sentence; they mark where one word ends and another begins. If every code word ends in a comma, and the comma does not appear elsewhere in a code word, the code is prefix-free. However, modern communication systems send everything as sequences of "1" and "0" – adding a third symbol would be expensive, and using it only at the ends of words would be inefficient. Morse code is an everyday example of a variable-length code with a comma. The long pauses between letters, and the even longer pauses between words, help people recognize where one letter (or word) ends, and the next begins. Similarly, Fibonacci coding uses a "11" to mark the end of every code word.

Self-synchronizing codes are prefix codes that allow frame synchronization.

Read more about this topic:  Prefix Code

Famous quotes containing the word techniques:

    It is easy to lose confidence in our natural ability to raise children. The true techniques for raising children are simple: Be with them, play with them, talk to them. You are not squandering their time no matter what the latest child development books say about “purposeful play” and “cognitive learning skills.”
    Neil Kurshan (20th century)

    The techniques of opening conversation are universal. I knew long ago and rediscovered that the best way to attract attention, help, and conversation is to be lost. A man who seeing his mother starving to death on a path kicks her in the stomach to clear the way, will cheerfully devote several hours of his time giving wrong directions to a total stranger who claims to be lost.
    John Steinbeck (1902–1968)