Eight Bits to Six Bits
Base-64 encoding takes a sequence of 8-bit bytes, breaks the sequence into 6-bit pieces, and maps each 6-bit piece to one of the 64 characters that make up the base-64 alphabet. These 64 output characters are common and safe to place in HTTP header fields: the upper- and lowercase letters, the digits, +, and /. The special character = is also used, as padding. The base-64 alphabet is shown in Table E-1.
Note that because base-64 encoding uses an 8-bit character to represent each 6 bits of information, every 3 bytes of input become 4 characters of output, so base 64-encoded strings are about 33% larger than the original values.
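The size overhead can be checked with a few lines of Python. This is a minimal sketch that assumes the standard library's `base64.b64encode`, which produces padded output; the helper name `encoded_length` is hypothetical and used only for illustration.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of padded base-64 output for n_bytes of input.

    Every group of up to 3 input bytes becomes 4 output characters.
    (Hypothetical helper, not part of any library.)
    """
    return 4 * ((n_bytes + 2) // 3)


original = bytes(300)                   # 300 zero bytes as sample input
encoded = base64.b64encode(original)    # standard-library encoder

print(len(original), len(encoded))      # 300 400
print(len(encoded) / len(original))     # roughly 1.33, i.e. about 33% larger
```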
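Table E-1. Base-64 alphabet
Value | Char | Value | Char | Value | Char | Value | Char | Value | Char | Value | Char | Value | Char | Value | Char |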
0 | A | 8 | I | 16 | Q | 24 | Y | 32 | g | 40 | o | 48 | w | 56 | 4 |
1 | B | 9 | J | 17 | R | 25 | Z | 33 | h | 41 | p | 49 | x | 57 | 5 |
2 | C | 10 | K | 18 | S | 26 | a | 34 | i | 42 | q | 50 | y | 58 | 6 |
3 | D | 11 | L | 19 | T | 27 | b | 35 | j | 43 | r | 51 | z | 59 | 7 |
4 | E | 12 | M | 20 | U | 28 | c | 36 | k | 44 | s | 52 | 0 | 60 | 8 |
5 | F | 13 | N | 21 | V | 29 | d | 37 | l | 45 | t | 53 | 1 | 61 | 9 |
6 | G | 14 | O | 22 | W | 30 | e | 38 | m | 46 | u | 54 | 2 | 62 | + |
7 | H | 15 | P | 23 | X | 31 | f | 39 | n | 47 | v | 55 | 3 | 63 | / |
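The alphabet in Table E-1 is regular enough to build in code rather than type out. The following sketch assumes Python's `string` module constants; the name `ALPHABET` is chosen here for illustration only.

```python
import string

# Values 0-25 are A-Z, 26-51 are a-z, 52-61 are 0-9, 62 is '+', 63 is '/'.
ALPHABET = (
    string.ascii_uppercase
    + string.ascii_lowercase
    + string.digits
    + "+/"
)

# Look up a few entries from the table by their 6-bit value.
for value in (0, 25, 26, 51, 52, 61, 62, 63):
    print(value, ALPHABET[value])
```

Indexing the string by a 6-bit value gives the corresponding output character, which is all the table lookup amounts to.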
Screenshot E-1 shows a simple example of base-64 encoding. Here, the three-character input value "Ow!" is base 64-encoded, resulting in the four-character base 64-encoded value "T3ch". It works like this:
1. The string "Ow!" is broken into three 8-bit bytes (0x4F, 0x77, 0x21).
2. The 3 bytes create the 24-bit binary value 010011110111011100100001.
3. These bits are segmented into the four 6-bit sequences 010011, 110111, 011100, 100001.
4. Each of these 6-bit values represents a number from 0 to 63, corresponding to one of the 64 characters in the base-64 alphabet. Here the values are 19, 55, 28, and 33, which map to T, 3, c, and h. The resulting base 64-encoded string is the 4-character string "T3ch", which can then be sent across the wire as "safe" 8-bit characters, because only the most portable characters are used (letters, numbers, etc.).
Screenshot E-1. Base-64 encoding example
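The four steps of the "Ow!" example can be traced in code. This is an illustrative sketch, not a full encoder: the function name `encode_triples` is hypothetical, and it deliberately accepts only inputs whose length is a multiple of 3 bytes so that no padding is needed. The result is compared against the standard library's `base64.b64encode` as a sanity check.

```python
import base64
import string

# Base-64 alphabet: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_triples(data: bytes) -> str:
    """Encode bytes whose length is a multiple of 3 (hypothetical helper)."""
    if len(data) % 3 != 0:
        raise ValueError("this sketch only handles lengths that are a multiple of 3")
    out = []
    for i in range(0, len(data), 3):
        # Steps 1-2: join three 8-bit bytes into one 24-bit value.
        group = int.from_bytes(data[i:i + 3], "big")
        # Step 3: split the 24 bits into four 6-bit values, high bits first.
        for shift in (18, 12, 6, 0):
            six_bits = (group >> shift) & 0x3F
            # Step 4: map each 6-bit value to its alphabet character.
            out.append(ALPHABET[six_bits])
    return "".join(out)


print(encode_triples(b"Ow!"))                      # T3ch
print(base64.b64encode(b"Ow!").decode("ascii"))    # T3ch
```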