Padding
The '==' sequence indicates that the last group contained only 1 byte, and '=' indicates that it contained 2 bytes. The example below illustrates how truncating the input of the whole of the above quote changes the output padding:
The same characters will be encoded differently depending on their position within the three-octet group that is encoded to produce the four characters. For example:
Input: pleasure.   Encodes to: cGxlYXN1cmUu
Input: leasure.    Encodes to: bGVhc3VyZS4=
Input: easure.     Encodes to: ZWFzdXJlLg==
Input: asure.      Encodes to: YXN1cmUu
Input: sure.       Encodes to: c3VyZS4=
The number of output bytes per input byte is approximately 4 / 3 (33% overhead) and converges to that value for a large number of bytes. More specifically, given an input of n bytes, the output will be 4⌈n / 3⌉ bytes long, including padding characters.
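The padding rule and the output length can be checked with a short Python sketch using the standard-library base64 module. The helper names encoded_length and padding_count are chosen here for illustration only. The closed form 4⌈n / 3⌉ used for the padded length is the usual one, written with integer arithmetic.

```python
import base64

def encoded_length(n: int) -> int:
    """Padded Base64 length for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)

def padding_count(n: int) -> int:
    """Number of '=' characters appended for n input bytes (0, 1, or 2)."""
    return (3 - n % 3) % 3

# The same inputs as in the example above.
for text in ["pleasure.", "leasure.", "easure.", "asure.", "sure."]:
    data = text.encode("ascii")
    encoded = base64.b64encode(data).decode("ascii")
    # Both helpers should agree with what the library actually produces.
    assert len(encoded) == encoded_length(len(data))
    assert encoded.count("=") == padding_count(len(data))
    print(f"{text:10} {len(data)} bytes -> {encoded}")
```

A final group of 1 byte gives two '=' characters, and a final group of 2 bytes gives one. An input whose length is a multiple of 3 gives none.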
From a theoretical point of view, the padding character is not needed, since the number of missing bytes can be calculated from the number of Base64 digits. Some implementations require the padding character, while others do not use it. One case where padding characters are required is when multiple Base64-encoded files are concatenated. The 2011 DEF CON Capture the Flag (CTF) qualifiers contained a puzzle with a file of concatenated Base64-encoded files.
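Both points can be illustrated with a minimal Python sketch, assuming the standard alphabet (A-Z, a-z, 0-9, '+', '/'). The function names decode_unpadded and decode_concatenated are hypothetical helpers, not part of any library. The first restores the padding from the digit count alone. The second relies on the padding to find where one encoded piece ends and the next begins.

```python
import base64
import re
from typing import List

def decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' characters were stripped."""
    # The number of missing '=' follows from the digit count modulo 4.
    missing = -len(text) % 4
    if missing == 3:
        # A single leftover digit carries only 6 bits, less than one byte.
        raise ValueError("invalid Base64 length")
    return base64.b64decode(text + "=" * missing)

def decode_concatenated(text: str) -> List[bytes]:
    """Decode padded Base64 pieces that were joined end to end.

    Assumption: every piece was padded before concatenation. A piece that
    needed no padding runs into the next one, which still decodes correctly
    because both are aligned on 4-character groups.
    """
    chunks = re.findall(r"[A-Za-z0-9+/]+={0,2}", text)
    return [base64.b64decode(chunk) for chunk in chunks]

print(decode_unpadded("ZWFzdXJlLg"))                # b'easure.'
print(decode_concatenated("bGVhc3VyZS4=c3VyZS4="))  # [b'leasure.', b'sure.']
```

Without the '=' characters, the joined string "bGVhc3VyZS4c3VyZS4" gives no indication of where the first piece ends, so the 4-character groups of the second piece are misaligned when decoded.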