OpenTechTips

Comprehensive IT Guides for Pros and Enthusiasts

Base64 Encoding Explained with Examples

May 27, 2020 - by Zsolt Agoston - last edited on May 31, 2020

Have you ever wondered why SSL certificates have a strange code in their body that seemingly consists only of letters, numbers, and the "+" and "/" characters? If you've ever checked the raw content of a saved email, you'll have seen that embedded pictures are represented with a similar code in the body.

This code is called Base64. Simply put, it is meant to convert binary data to text format. But what does that mean? What is a binary file, why does it need to be converted to plain text, and how does the conversion work?

A little background

Computers work with numbers. This is how they are designed: the binary digits 0 and 1 can represent any number, which the computer can store, calculate with, transmit, etc. But what about letters? In the early 1960s the American Standard Code for Information Interchange (ASCII) was created to map a number to every letter, giving all computer makers a standard to follow.

In practice each character is stored in a whole byte. A byte consists of 8 bits. One bit can represent two values: either 0 or 1. Two bits combined can represent twice as many: 00, 01, 10, 11. In decimal that means 4 different values. Following that logic, 8 bits - a byte - can take 256 values. The original ASCII standard defines only the first 128 of them (7 bits); the extended ASCII tables use the remaining 128 values for additional characters, so the table has 256 characters to work with.
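The doubling rule is easy to check in Python. This is only a minimal sketch; the helper name `value_count` is made up for illustration.

```python
def value_count(bits):
    """Return how many distinct values fit into the given number of bits."""
    return 2 ** bits

# Every extra bit doubles the number of values.
for bits in (1, 2, 8):
    print(bits, "bit(s) ->", value_count(bits), "values")
```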

In this ASCII table each character - letter, digit or punctuation mark - is represented by a number, and this is how the computer actually works with them. For instance, the decimal number 65 represents the letter "A" for the computer.
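Python exposes this mapping through its built-in `ord()` and `chr()` functions. A short sketch (the variable names are illustrative):

```python
# ord() returns the number behind a character, chr() goes the other way.
code = ord("A")
letter = chr(65)
bits = format(code, "08b")  # the same number written as 8 binary digits

print(code)    # 65
print(letter)  # A
print(bits)    # 01000001
```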

Figure: ASCII table. Source: https://theasciicode.com.ar/

Binary vs text files

Looking at the ASCII table above, we notice that a few numbers represent special characters, like 0, the null character, 10, the line feed, or 13, the carriage return. These control characters - among others - have a special meaning to the computer: they are instructions rather than symbols to display. If we try to print arbitrary binary data on the screen, we usually get gibberish and errors back, because the terminal tries to interpret some of the bytes as control codes.

This is the point where we need to talk about the difference between binary and text files. Binary files can contain all these special characters: they can use all 256 values in the ASCII table. Text files contain only printable characters, plus a few whitespace characters such as line breaks.
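The distinction can be sketched as a simple check in Python. The function name `looks_like_text` and the exact set of accepted characters are assumptions made for this example, not a formal definition of a text file.

```python
import string

# Bytes we accept in "text": printable ASCII plus common whitespace.
# This set is an assumption for the sketch, not a standard.
TEXT_BYTES = set(string.printable.encode("ascii"))

def looks_like_text(data: bytes) -> bool:
    """Return True if every byte is printable ASCII or common whitespace."""
    return all(b in TEXT_BYTES for b in data)

print(looks_like_text(b"hello!\n"))           # True
print(looks_like_text(b"\x00\x01\xff data"))  # False
```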

Base64 characters

It was recognized early that in many cases it is necessary to transmit binary data in simple text format. SSL certificates are perfect examples: they contain a lot of information about the domain, SANs, the algorithms used, the public key, the serial number, the issuer, etc., crammed into one X.509 certificate file. However, those files are binary files. To distribute them more easily, we need them in a plain text format. This is where Base64 comes into the picture. The Base64 algorithm dissects the original data and encodes it using only printable characters. As the name suggests, 64 characters are allocated for this purpose: the 26 uppercase English letters (A-Z), the 26 lowercase letters (a-z), the 10 digits (0-9) and the "+" and "/" characters.

The designers could have gone with any other two characters; they chose these. Also, at the end of Base64 encoded content you might see one or two "=" characters. They are for padding only. More on that later.

Base64 mapping
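The full mapping can be written down in a few lines of Python. This is a minimal sketch; the constant name `BASE64_CHARS` is chosen here for illustration. The position of a character in the string is the number it stands for.

```python
import string

# Build the 64-character alphabet in index order: A-Z, a-z, 0-9, "+", "/".
BASE64_CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

print(len(BASE64_CHARS))        # 64
print(BASE64_CHARS[0])          # A
print(BASE64_CHARS[26])         # a
print(BASE64_CHARS.index("/"))  # 63
```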

8bit to 6bit

Now let's see how the encoding process works. As discussed earlier, the original text consists of a chain of regular characters that are represented by their 8-bit ASCII numbers. We use the string "hello!" as an example.

The letters that you see are stored as the following ASCII numbers in memory.
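The values can be listed with a short Python snippet (variable names are illustrative):

```python
text = "hello!"

# The ASCII number of every character in the string.
codes = [ord(ch) for ch in text]

for ch, code in zip(text, codes):
    # Show the character, its decimal value and the same value as 8 bits.
    print(ch, code, format(code, "08b"))
```

For "hello!" the decimal values are 104, 101, 108, 108, 111 and 33.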

However, our Base64 table has only 64 entries, so a Base64 character carries only 6 bits (2^6 = 64) instead of 8 bits (2^8 = 256).

To convert 8-bit characters to 6-bit values, we first need to find the least common multiple of 6 and 8, which is 24.

That simply means the smallest ASCII block we can convert to Base64 is 24 bits long, which translates to 3 ASCII characters. As a result, we get a 4-character Base64 output.
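The block size arithmetic can be verified as follows. The helper `lcm` is written out by hand here so the sketch does not depend on a particular Python version.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

block_bits = lcm(6, 8)

print(block_bits)       # 24 bits per block
print(block_bits // 8)  # 3 input characters per block
print(block_bits // 6)  # 4 Base64 characters per block
```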

Encode

Our example is just perfect as it contains 6 characters, which is a multiple of 3.

First, the encoder divides the original string into two pieces to create 24-bit blocks: "hel" and "lo!".

The decimal ASCII values of "hel" are 104, 101 and 108. In binary it is 01101000, 01100101, 01101100.

Then the encoder divides this 24-bit block into 4 pieces to get 6-bit chunks: 011010, 000110, 010101, 101100. In decimal, they correspond to the numbers 26, 6, 21 and 44.

The last step is mapping those numbers using the Base64 table where 0 maps to A, 1 to B, etc.

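For the first block we get: aGVs. Repeating the same steps for the second block, "lo!", gives bG8h, so the complete string "hello!" encodes to aGVsbG8h.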
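The steps for a single block can be sketched in Python like this. The function name `encode_block` is hypothetical, and the sketch assumes it always receives exactly three ASCII characters.

```python
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_block(block):
    """Encode exactly three characters into four Base64 characters."""
    # Join the three 8-bit values into one 24-bit number.
    bits = (ord(block[0]) << 16) | (ord(block[1]) << 8) | ord(block[2])
    # Cut the 24 bits into four 6-bit values, left to right.
    values = [(bits >> shift) & 63 for shift in (18, 12, 6, 0)]
    # Look up each value in the Base64 table.
    return values, "".join(BASE64_CHARS[v] for v in values)

print(encode_block("hel"))  # ([26, 6, 21, 44], 'aGVs')
print(encode_block("lo!"))  # ([27, 6, 60, 33], 'bG8h')
```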

Padding

Our encoder is almost complete. The last thing we need to cover is what happens if the length of the string that we want to encode is not a multiple of 3. In that case the last block won't be 3 characters long, and with only one or two characters we cannot run the encoding algorithm, as it works with exactly 24 bits at a time. This is where padding comes into the picture.

If the last block is one character long, the encoder adds two padding characters to the end of the original string. If the last block is two characters long, only one padding character is needed. The padding should be the null character (\x00): standard encoders fill the missing bits with zeros, so that every encoder produces the same output for the same input. The encoder then encodes the block and replaces the output characters that consist purely of padding with "=" marks, two for two padding characters and one for one. This way the decoder knows how many characters to discard from the end of the decoded string to get the original string back.

As a quick example, take the string "hello!y". The encoder needs to add two padding characters to support the encoding process. Notice that "hello!you" and the shorter "hello!y" have the same encoded length, 12 characters: aGVsbG8heW91 and aGVsbG8heQ==. This is because the encoder encodes "hello!y" followed by the two padding characters, and the decoder knows from the "==" signs to discard the two padding characters from the end of the decoded string.
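The effect of padding on the output length can be observed with the `base64` module from Python's standard library, used here as a reference encoder. The list of sample strings is just an illustration.

```python
import base64

samples = ("hello!", "hello!y", "hello!yo", "hello!you")
results = {}

for text in samples:
    # b64encode works on bytes, so convert the string first.
    encoded = base64.b64encode(text.encode("ascii")).decode("ascii")
    results[text] = encoded
    print(f"{text!r:12} -> {encoded} ({len(encoded)} characters)")
```

Every output length is a multiple of 4, and the number of "=" marks at the end tells how many padding characters were needed.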

Base64 decoding

Decoding is simply the reverse of the encoding process.  

The decoder looks up each Base64 character in the Base64 table to get its numeric value back: "a" becomes 26, "G" becomes 6, and so on. Four of these 6-bit values form 24 bits, which are divided into three 8-bit pieces. Those are the ASCII values of the original characters. After decoding all blocks, any padding is discarded from the end and we're done!
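Decoding a single block can be sketched the same way. The function name `decode_block` is hypothetical, and the sketch assumes exactly four valid Base64 characters without "=" padding.

```python
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def decode_block(block):
    """Decode exactly four Base64 characters into three characters."""
    bits = 0
    for ch in block:
        # Look up the 6-bit value and append it to the 24-bit number.
        bits = (bits << 6) | BASE64_CHARS.index(ch)
    # Cut the 24 bits into three 8-bit values, left to right.
    return "".join(chr((bits >> shift) & 255) for shift in (16, 8, 0))

print(decode_block("aGVs"))  # hel
print(decode_block("bG8h"))  # lo!
```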

Encoders / Decoders

Python Base64 encoder and decoder - detailed guide

import sys

def base64encode(s):

  i = 0
  base64 = ''
  base64chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
  
  # Pad with null characters if the length is not a multiple of 3
  pad = (3 - len(s) % 3) % 3
  s += '\x00' * pad
  
  # Iterate through the whole input string
  while i < len(s):
    b = 0

    # Take 3 characters at a time, convert them to 4 base64 chars
    for j in range(3):
      
      # Get the ASCII code of the next character in line (must be below 256)
      n = ord(s[i])
      i += 1
  
      # Concatenate the three characters into one 24-bit number
      b += n << 8 * (2-j)
    
    # Convert the 24 bits to four Base64 chars
    base64 += base64chars[ (b >> 18) & 63 ]
    base64 += base64chars[ (b >> 12) & 63 ]
    base64 += base64chars[ (b >> 6) & 63 ]
    base64 += base64chars[ b & 63 ]

  # Replace the characters that come purely from padding with "="
  if pad:
    base64 = base64[:-pad] + '=' * pad
  
  # Print the Base64 encoded result
  print (base64)

base64encode(sys.argv[1])

Decoder

import sys

def base64decode(s):
  i = 0
  decoded = ''
  base64chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
  
  # Count the padding and swap each "=" for "A" (value 0),
  # so the input length stays a multiple of 4
  if s[-2:] == '==':
    s = s[0:-2] + 'AA'
    padd = 2
  elif s[-1:] == '=':
    s = s[0:-1] + 'A'
    padd = 1
  else:
    padd = 0
  
  # Take 4 characters at a time 
  while i < len(s):
    d = 0
    for j in range(4):
      
      # Look up the 6-bit value of the next character and shift it into place
      d += base64chars.index( s[i] ) << (18 - j * 6)
      i += 1
    
    # Convert the 24 bits back to three ASCII characters
    decoded += chr( (d >> 16 ) & 255 )
    decoded += chr( (d >> 8 ) & 255 )
    decoded += chr( d & 255 )
  
  # Remove the characters that came from padding
  decoded = decoded[0:len( decoded ) - padd]
  
  # Print the decoded result
  print (decoded)

base64decode(sys.argv[1])
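The scripts above are written for learning and work on ASCII strings. For real binary data, Python ships a `base64` module in its standard library. The sketch below uses it to illustrate the point of the whole exercise: arbitrary bytes, control characters included, survive a trip through plain text. The variable names are illustrative.

```python
import base64

# Every possible byte value, including control characters such as 0, 10 and 13.
data = bytes(range(256))

encoded = base64.b64encode(data)     # bytes that contain only Base64 characters
decoded = base64.b64decode(encoded)  # the original binary data again

print(encoded[:16].decode("ascii"))
print(len(data), "bytes ->", len(encoded), "Base64 characters")
print(decoded == data)
```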

 PowerShell Base64 encoder and decoder - detailed guide

function Base64Encode($s) {
$i = 0
$base64 = ''
$base64chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
  
# Pad with null characters if the length is not a multiple of 3
$pad = (3 - $s.length % 3) % 3
if ($pad -gt 0) {
	$s += "`0" * $pad
}

# Iterate through the whole input string
while ($i -lt $s.length) {
	# Take 3 characters at a time, convert them to 4 base64 chars 
	$b = 0
	for ($j=0; $j -lt 3; $j++) {
	
		# Get the ASCII code of the next character in line
		$ascii = [int][char]$s[$i]
		$i++
		
		# Concatenate the three characters into one 24-bit number
		$b += $ascii -shl 8 * (2-$j)
	}
	
	# Convert the 24 bits to four Base64 chars
	$base64 += $base64chars[ ($b -shr 18) -band 63 ]
	$base64 += $base64chars[ ($b -shr 12) -band 63 ]
	$base64 += $base64chars[ ($b -shr 6) -band 63 ]
	$base64 += $base64chars[ $b -band 63 ]
}

# Replace the characters that come purely from padding with "="
if ($pad -gt 0) {
	$base64 = $base64.substring(0, $base64.length - $pad) + ("=" * $pad)
}

# Print the Base64 encoded result
Write-Host $base64
}

Decoder

function Base64Decode($s) {
$i = 0
$decoded = ''
$base64chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
$padd = 0

# Count the padding and swap each "=" for "A" (value 0),
# so the input length stays a multiple of 4
if ($s.EndsWith("==")) {
	$s = $s.substring(0, $s.length - 2) + "AA"
	$padd = 2
}
elseif ($s.EndsWith("=")) {
	$s = $s.substring(0, $s.length - 1) + "A"
	$padd = 1
}

# Take 4 characters at a time
while ($i -lt $s.length) {
	$d = 0
	
	for ($j=0; $j -lt 4; $j++) {
		# Look up the 6-bit value of the next character and shift it into place
		$d += $base64chars.indexof($s[$i]) -shl (18 - $j * 6)
		$i++
	}

	# Convert the 24 bits back to three ASCII characters
	$decoded += [char](($d -shr 16) -band 255)
	$decoded += [char](($d -shr 8) -band 255)
	$decoded += [char]($d -band 255)
}

# Remove the characters that came from padding
$decoded = $decoded.substring(0, $decoded.length - $padd)

# Print the decoded result
Write-Host $decoded
}
