Byte Pair Encoding Example
Suppose we wanted to encode the data:

    aaabdaaabac

The byte pair "aa" occurs most often, so it will be replaced by a byte that is not used in the data, "Z". Now we have the following data and replacement table:

    ZabdZabac
    Z=aa

Then we repeat the process with byte pair "ab", replacing it with "Y":

    ZYdZYac
    Y=ab
    Z=aa

We could stop here, as the only literal byte pair left occurs only once. Or we could continue the process and use recursive byte pair encoding, replacing "ZY" with "X":

    XdXac
    X=ZY
    Y=ab
    Z=aa

This data cannot be compressed further by byte pair encoding because there are no pairs of bytes that occur more than once.
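The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `bpe_compress` and its parameters are hypothetical, the data is treated as a string of one-character symbols, and the caller is assumed to supply replacement symbols that do not occur in the data. Pairs are counted with overlaps (so "aa" is counted four times in the input), and ties are broken by taking the lexicographically largest pair, an arbitrary choice that happens to reproduce the order used in the example.

```python
from collections import Counter

def bpe_compress(data: str, unused_symbols: str):
    """Repeatedly replace the most frequent adjacent pair with an unused symbol.

    Returns the compressed string and the replacement table as a list of
    (symbol, pair) tuples, in the order the replacements were made.
    """
    table = []
    for symbol in unused_symbols:
        # Count adjacent pairs, including overlapping occurrences.
        counts = Counter(data[i:i + 2] for i in range(len(data) - 1))
        if not counts:
            break
        # Assumption: ties go to the lexicographically largest pair.
        pair = max(counts, key=lambda p: (counts[p], p))
        if counts[pair] < 2:
            break  # no pair occurs more than once, so nothing is gained
        # str.replace substitutes non-overlapping matches, left to right.
        data = data.replace(pair, symbol)
        table.append((symbol, pair))
    return data, table

compressed, table = bpe_compress("aaabdaaabac", "ZYX")
print(compressed)  # XdXac
print(table)       # [('Z', 'aa'), ('Y', 'ab'), ('X', 'ZY')]
```

With a different tie-breaking rule the second step could pick "Za" instead of "ab", which gives a different replacement table for the same input.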
To decompress the data, simply perform the replacements in the reverse order.
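As a rough sketch of that reversal, the snippet below expands the final data from the example using its replacement table. The function name `bpe_decompress` and the list-of-tuples table layout are assumptions made for illustration; the table is listed in the order the replacements were made, so it is walked backwards.

```python
def bpe_decompress(data: str, table):
    """Undo the replacements, most recent first."""
    for symbol, pair in reversed(table):
        data = data.replace(symbol, pair)
    return data

# Replacement table from the example, in the order it was built.
table = [("Z", "aa"), ("Y", "ab"), ("X", "ZY")]
restored = bpe_decompress("XdXac", table)
print(restored)  # aaabdaaabac
```

Expanding "X" first matters because its definition "ZY" contains symbols that are themselves expanded in the later passes.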