Azul Plugin Lookback
The AzulPluginLookbackSearch and AzulPluginLookbackHash plugins handle content under single-byte substitution ciphers.
Using a look back distance technique to match byte repetition patterns, the search plugin looks for a configurable list of crib strings.
Additionally, the same technique can be used to derive a normalised version of the content that should produce the same hashes under different substitution ciphers.
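The look back distance idea can be illustrated with a minimal sketch. It assumes the simplest form of the technique: each byte is replaced by the distance back to the previous occurrence of the same byte value, or 0 when the value has not been seen before. The function name `lookback_distances` is hypothetical and is not taken from the lookback-search package, whose actual encoding may differ.

```python
def lookback_distances(data: bytes) -> list[int]:
    """Replace each byte with the distance back to its previous occurrence.

    A value of 0 marks the first time a byte value is seen. This is an
    illustrative encoding, not the lookback-search package's own.
    """
    last_seen: dict[int, int] = {}
    distances = []
    for index, value in enumerate(data):
        distances.append(index - last_seen[value] if value in last_seen else 0)
        last_seen[value] = index
    return distances


plain = b"hello world"
xored = bytes(b ^ 0x5A for b in plain)            # one substitution cipher
shifted = bytes((b + 13) % 256 for b in plain)    # a different substitution cipher

print(lookback_distances(plain))  # [0, 0, 0, 1, 0, 0, 0, 3, 0, 6, 0]
print(lookback_distances(plain) == lookback_distances(xored) == lookback_distances(shifted))  # True
```

A single-byte substitution cipher maps equal bytes to equal bytes and distinct bytes to distinct bytes, so the positions at which values repeat do not move and the distance sequence is unchanged.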
Installation
The plugin relies on the lookback-search package being available in the
current environment or on an accessible PyPI server.
It can be installed using pip:
pip install azul-lookback
Usage: azul-lookback-search
This plugin uses the lookback-search package to search for byte patterns under
arbitrary substitution ciphers.
When such patterns are detected, both the plaintext pattern and the matching cipher data are output as features. Both features are labelled with the category the pattern came from (e.g. 'generic_pe'), as specified in the configuration files.
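As a rough illustration of how a crib can be found under an unknown substitution cipher, the sketch below compares the repetition pattern of the crib with the repetition pattern of every window in the data. Everything here is an assumption made for illustration: `lookback_distances`, `find_crib`, `toy_cipher` and the in-memory `CRIBS` dictionary (standing in for the configuration files) are hypothetical names, and the lookback-search package may well use a different and faster matching strategy.

```python
def lookback_distances(data: bytes) -> list[int]:
    """Distance back to the previous occurrence of each byte (0 = first seen)."""
    last_seen: dict[int, int] = {}
    out = []
    for i, b in enumerate(data):
        out.append(i - last_seen[b] if b in last_seen else 0)
        last_seen[b] = i
    return out


def find_crib(data: bytes, crib: bytes) -> list[int]:
    """Return offsets where a window of data repeats bytes exactly as crib does."""
    crib_sig = lookback_distances(crib)
    data_sig = lookback_distances(data)
    hits = []
    for start in range(len(data) - len(crib) + 1):
        # A distance that reaches back before the window start counts as
        # "first seen" (0) inside the window.
        if all(
            (data_sig[start + j] if data_sig[start + j] <= j else 0) == crib_sig[j]
            for j in range(len(crib))
        ):
            hits.append(start)
    return hits


def toy_cipher(data: bytes) -> bytes:
    """Hypothetical substitution cipher: an affine byte permutation."""
    return bytes((b * 7 + 3) % 256 for b in data)


# Hypothetical stand-in for the category -> crib list held in configuration.
CRIBS = {"generic_pe": [b"This program cannot be run in DOS mode."]}

data = b"\x00\x01\x02\x03" + toy_cipher(CRIBS["generic_pe"][0]) + b"\xff\xfe"

for category, cribs in CRIBS.items():
    for crib in cribs:
        for offset in find_crib(data, crib):
            # Plaintext pattern and the obfuscated bytes that matched it.
            print(category, offset, crib, data[offset:offset + len(crib)])
```

In this sketch the single hit is at offset 4, immediately after the four-byte prefix. Short cribs with few repeated bytes would match far more windows by chance, which is one reason long, repetitive strings make better cribs.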
Usage on local files:
azul-lookback-search obfuscated.bin
Example Output:
----- LookbackSearch results -----
OK
Output features:
lookback_pattern: generic_pe - b'This program cannot be run in DOS mode.'
generic_pe - b'This program cannot be run in DOS mode.'
lookback_match_data: generic_pe - b'\xaa\xbe\xbf\xc9v\xc6\xc8\xc5\xbd\xc8\xb7\xc3v\xb9\xb7\xc4\xc4\xc5\xcav\xb8\xbbv\xc8\xcb\xc4v\xbf\xc4v\x9a\xa5\xa9v\xc3\xc5\xba\xbb\x84'
generic_pe - b'\xcd\xf1\xf0\xea\xb9\xe9\xeb\xf6\xfe\xeb\xf8\xf4\xb9\xfa\xf8\xf7\xf7\xf6\xed\xb9\xfb\xfc\xb9\xeb\xec\xf7\xb9\xf0\xf7\xb9\xdd\xd6\xca\xb9\xf4\xf6\xfd\xfc\xb7'
Feature key:
lookback_pattern: Plaintext pattern detected underneath obfuscation
lookback_match_data: Obfuscated data which matched on the pattern
Automated usage in system:
azul-lookback-search --server http://azul-dispatcher.localnet/
Usage: azul-lookback-hash
This plugin uses the lookback-search package to generate a "symbol independent"
SHA256 and ssdeep digest. These digests are both unaffected by the application
of substitution ciphers.
Examples:
The following data is written to a file named plaintext.bin:
"This is some data we can obfuscate in a few different ways to generate test files."
Run the plugin on this plaintext:
$ azul-lookback-hash plaintext.bin
Output features:
lookback_hash: e2ca05df25cd5125c78b0cc17226e1a3155deeae1b041a49bf39a09f22e59c77
lookback_ssdeep: 3:wlgljlSc7oqliMxc5g8lo6QshsmCo5l:QgvR6yqsTo5l
Create a new file by generating a random substitution cipher and applying it to the plaintext data:
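import random
from pathlib import Path

plaintext = Path("plaintext.bin").read_bytes()
sbox = list(range(256))
random.shuffle(sbox)  # random permutation of all 256 byte values
Path("random_sbox.bin").write_bytes(bytes(sbox[c] for c in plaintext))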
Run the plugin on this obfuscated data:
$ azul-lookback-hash random_sbox.bin
Output features:
lookback_hash: e2ca05df25cd5125c78b0cc17226e1a3155deeae1b041a49bf39a09f22e59c77
lookback_ssdeep: 3:wlgljlSc7oqliMxc5g8lo6QshsmCo5l:QgvR6yqsTo5l
The plugin generates the same hash features for both the plaintext and obfuscated data, even though their binary content is wildly different.
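The reason the digests agree can be sketched with the standard library alone. The example below assumes a simple normalisation in which each byte becomes the distance to its previous occurrence, capped at 255 so that it fits in one byte, and then hashes the normalised bytes with SHA256. The names `lookback_normalise` and `symbol_independent_sha256` are hypothetical, the capping rule is an assumption, and the resulting digest is not expected to equal the plugin's lookback_hash value.

```python
import hashlib
import random


def lookback_normalise(data: bytes) -> bytes:
    """Encode each byte as the capped distance to its previous occurrence."""
    last_seen: dict[int, int] = {}
    out = bytearray()
    for i, b in enumerate(data):
        # 0 = first occurrence; larger distances are capped at 255 (assumption).
        out.append(min(i - last_seen[b], 255) if b in last_seen else 0)
        last_seen[b] = i
    return bytes(out)


def symbol_independent_sha256(data: bytes) -> str:
    """SHA256 over the normalised form rather than the raw bytes."""
    return hashlib.sha256(lookback_normalise(data)).hexdigest()


plaintext = b"This is some data we can obfuscate in a few different ways to generate test files."

# Random substitution cipher, as in the example above.
sbox = list(range(256))
random.shuffle(sbox)
obfuscated = bytes(sbox[c] for c in plaintext)

print(symbol_independent_sha256(plaintext) == symbol_independent_sha256(obfuscated))  # True
```

The comparison holds for any permutation in `sbox`, because the normalised form depends only on where byte values repeat and not on what the values are.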
ssdeep
Fuzzy hashing allows data to be linked by similarity even underneath different substitution ciphers. Here are the regular ssdeep hashes for three Poison Ivy samples from VirusTotal, which clearly show the similarity between the samples:
192:PJGc1Zl2+VAfNxl1THs6xgzgVGjPlR6nQAh9bNOJW8h:PJGcMJxDTHfRmM9bQh
192:zJGc1Zl2+VAfNxl1THs6xgzgVGjPlRhnQAlKhFo22XGZi:zJGcMJxDTHfRmhWw
192:OJGc1Zl2+VAfNxl1THs6xgzgVGjPlRkTnQAx:OJGcMJxDTHfRmap
Below are the regular ssdeep hashes after each sample is obfuscated with a random substitution cipher. Each ssdeep hash is very different from the others:
192:LcPvfaKqS36rErFE+8gVqDC1r19YKW+47:AnSdSuErFE+fADCF19p47
96:TsdcVP/KBO/q+br81PucGQ4ahkKY87m1iJx14u8U7hjF5KKoF2pC7wPxl0:T2cVnKMy6rij54Ik+zxx8GhjWgsac
96:ecNVsyskZ1T7GogVHqGOe4RUuZZkLaUGKzzN2UsxFOIQ2i/ZAUFgOZc+btjQEI:hNAgTKHYfoarKzqxTQjZ6OKElE
By comparison, below are the lookback_ssdeep features generated by this plugin when run over the obfuscated samples. The first two ssdeep features are clearly similar. Additionally, the tail portion of the third hash (beginning "jdSerhj3Z...") is similar to the start of the other two hashes:
192:ISerhj3ZKeBNaTbnNwH1y7BCirWraiNC4fbeqSd45Elos:mhFKe30p+1y7BCirE84vSdWZs
192:jESerhj3ZKeBNaTbnNwH1y7BCirXr+a2RsB7Rf/NzeQNvYo8:jChFKe30p+1y7BCirKX21Rf/NyQxC
96:K27xSxsrhiCH4YcEKeRZAmNzJoe+pSEnR31dkWtG16d7BVqVVIrx/XayJkm:jdSerhj3ZKeBNaTbnNwH1y7BCirBayqm
The benefit of having a "symbol independent" ssdeep hash is offset slightly by a reduction in similarity scores when comparing the resulting ssdeep hashes.
Comparing the regular ssdeep hashes of the original Poison Ivy samples produced similarity scores of:
77, 82 and 80
Comparing the lookback ssdeep hashes of the obfuscated samples produced similarity scores of:
66, 71 and 69
These scores are still far better than those produced by the regular ssdeep hashes of the obfuscated samples:
0, 0 and 0
Python Package management
This Python package is managed using setup.py and pyproject.toml files.
Installing and testing the Python package is standardised through tox. Tox commands include:
# Run all standard tox actions
tox
# Run linting only
tox -e style
# Run tests only
tox -e test
Dependency management
Dependencies are managed in the requirements.txt, requirements_test.txt and debian.txt files.
The requirements files list the Python package dependencies for normal use and the additional ones needed for tests (e.g. pytest, black and flake8 are test-only dependencies).
The debian.txt file manages the Debian dependencies that need to be installed on development systems and Docker images.
Sometimes the debian.txt file is insufficient and in this case the Dockerfile may need to be modified directly to install complex dependencies.