Base64 is a binary-to-text encoding scheme that represents binary data as printable ASCII text. It is commonly used when binary data must be stored or transferred through environments that only reliably support text content.
Common uses include encoding email attachments, embedding image data in HTML or CSS, and carrying binary payloads in text-based API formats such as JSON.
Base64 came into wide use in the early days of internet email, particularly through the MIME (Multipurpose Internet Mail Extensions) standard published in 1992, which adopted an encoding nearly identical to the one defined earlier for Privacy-Enhanced Mail (PEM). The need arose because early email protocols could only reliably transmit 7-bit ASCII characters, while binary files like images or executables use all 8 bits of each byte. Base64 solved this problem by encoding 8-bit data into a subset of ASCII characters.
Base64 encoding transforms binary data by dividing it into 6-bit chunks, each representing one of 64 possible values (0-63). These values are then represented by ASCII characters (A-Z, a-z, 0-9, + and /). The equals sign (=) is used for padding at the end if needed.
The encoding process takes 3 bytes (24 bits) of binary data at a time and converts them into 4 Base64 characters. If the input data length is not divisible by 3, padding characters are added to ensure the output length is a multiple of 4 characters.
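As a rough illustration of the 3-byte-to-4-character grouping, the following Python sketch encodes data by hand and cross-checks the result against the standard library's base64.b64encode. The names encode_base64 and ALPHABET are illustrative choices for this sketch, not part of any library.

```python
import base64

# Standard Base64 alphabet: position 0-63 maps to one character.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_base64(data: bytes) -> str:
    """Hand-rolled encoder for illustration only (hypothetical helper)."""
    out = []
    for i in range(0, len(data), 3):
        block = data[i:i + 3]
        missing = 3 - len(block)  # 0, 1, or 2 bytes short of a full block
        # Pack the block into a 24-bit integer, zero-filling missing bytes.
        n = int.from_bytes(block + b"\x00" * missing, "big")
        # Split the 24 bits into four 6-bit values, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that carry no input bits are replaced with "=" padding.
        if missing:
            chars[-missing:] = "=" * missing
        out.extend(chars)
    return "".join(out)


# Cross-check against the standard library on inputs of several lengths.
for sample in (b"", b"M", b"Ma", b"Man", b"Base64!"):
    assert encode_base64(sample) == base64.b64encode(sample).decode("ascii")
```

In practice the standard library function is the better choice; the manual version is only meant to make the bit regrouping visible.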
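The standard Base64 alphabet uses the following 64 characters, listed in order of the values 0-63:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/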
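A short Python sketch, assuming only the standard library, builds the alphabet from its four ranges and looks up the characters at the range boundaries. The constant name ALPHABET is an illustrative choice.

```python
import string

# Build the 64-character alphabet: A-Z, a-z, 0-9, then "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

assert len(ALPHABET) == 64

# Print the character assigned to a few 6-bit values at the range boundaries.
for value in (0, 25, 26, 51, 52, 61, 62, 63):
    print(value, ALPHABET[value])
```

The boundary values show where each range starts and ends: 0 and 25 are A and Z, 26 and 51 are a and z, 52 and 61 are 0 and 9, and 62 and 63 are + and /.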
Several variants of Base64 exist for specific use cases. The URL-safe variant, often called Base64URL, replaces + and / with - and _ so that encoded values can appear in URLs and filenames without further escaping. Other variants include the encoding used by PEM (Privacy-Enhanced Mail) and Radix-64, the form used by OpenPGP, which appends a checksum to the encoded data.
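The difference between the standard and URL-safe alphabets can be seen with Python's base64 module. The sketch below uses three input bytes picked specifically so that every output character is one of the two that differ; the variable names are illustrative.

```python
import base64

# Bytes chosen so the 6-bit values are 62, 63, 62, 63.
raw = bytes([0xFB, 0xFF, 0xBF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/+/
print(url_safe)  # -_-_

# Each variant must be decoded with its matching function.
assert base64.b64decode(standard) == raw
assert base64.urlsafe_b64decode(url_safe) == raw
```

Only the characters for values 62 and 63 change, so the two variants produce identical output for any input that never hits those values.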
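For example, to encode the ASCII text "Man": the three characters have the byte values 77, 97, and 110, or 01001101 01100001 01101110 in binary. Regrouping these 24 bits into 6-bit chunks gives 010011 010110 000101 101110, which are the values 19, 22, 5, and 46. These map to the characters T, W, F, and u, so "Man" encodes as "TWFu".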
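The regrouping for "Man" can be traced step by step in a short Python sketch. The variable names are illustrative, and the code handles only this padding-free case, where the input is exactly 3 bytes.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

data = "Man".encode("ascii")

# Write each byte as 8 binary digits and join them into one 24-bit string.
bits = "".join(f"{byte:08b}" for byte in data)

# Regroup the 24 bits into four 6-bit chunks.
chunks = [bits[i:i + 6] for i in range(0, len(bits), 6)]

# Interpret each chunk as a number from 0 to 63 and look up its character.
values = [int(chunk, 2) for chunk in chunks]
encoded = "".join(ALPHABET[v] for v in values)

print(list(data))  # [77, 97, 110]
print(chunks)      # ['010011', '010110', '000101', '101110']
print(values)      # [19, 22, 5, 46]
print(encoded)     # TWFu
```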
If the number of bytes to encode is not divisible by 3, padding is used to maintain proper alignment. The padding character "=" has no data significance; it simply indicates that fewer than 24 bits are encoded in the final Base64 block. One "=" means that the final block encodes just 16 bits, while "==" means it encodes just 8 bits.
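The padding rule can be checked with Python's base64 module by encoding inputs of 3, 2, and 1 bytes. The helper final_block_bits is a hypothetical name introduced for this sketch; it assumes a non-empty, correctly padded Base64 string.

```python
import base64


def final_block_bits(encoded: str) -> int:
    """Bits of real data in the last 4-character block (hypothetical helper)."""
    padding = len(encoded) - len(encoded.rstrip("="))
    return 24 - 8 * padding


for text in ("Man", "Ma", "M"):
    encoded = base64.b64encode(text.encode("ascii")).decode("ascii")
    print(text, encoded, final_block_bits(encoded))
```

This prints TWFu with 24 bits, TWE= with 16 bits, and TQ== with 8 bits, matching the rule that each "=" stands for one missing input byte.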
Base64 encoding increases the data size by approximately 33% compared to the original binary data (specifically, the output is 4/3 times the size of the input, rounded up to a multiple of 4 characters by padding). This overhead is the trade-off for getting binary data through text-only systems unaltered.
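The size relationship can be written as a small formula and checked against real output. In the sketch below, encoded_length is a hypothetical helper that assumes padded output with no line breaks inserted.

```python
import base64


def encoded_length(n: int) -> int:
    """Padded Base64 length for n input bytes: 4 characters per 3-byte block."""
    return 4 * ((n + 2) // 3)


for n in (1, 2, 3, 100, 1_000_000):
    actual = len(base64.b64encode(bytes(n)))
    assert actual == encoded_length(n)
    print(n, actual, round(actual / n, 3))
```

For very short inputs padding dominates (1 byte becomes 4 characters), while for large inputs the ratio settles close to 4/3.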
...
While Base64 is useful, it has limitations: