Part one - security, performance, obfuscation, and compression
This series of blog posts will look at a range of techniques commonly used to avoid detection by antivirus products, and at how readily available the tools that implement them are.
Security through obscurity
The latter is normally achieved by shortening function and variable names and removing whitespace. While there are many examples out there, a quick look at the source code for the Google home page – a chunk of which is shown below – reveals (a) the complexity behind the apparent simplicity; and (b) just how much of an impact this can have on the legibility of a piece of code.
Malicious obfuscation, on the other hand, will typically go further and encrypt the underlying source code of the script, which is then decrypted on the fly when the script is run. These techniques occasionally produce some unusual-looking results, such as in the example shown below.
(For the curious, the alphabet used within the ‘ST’ string is the Unified Canadian Aboriginal Syllabics Unicode character set.)
Figure 2: Obfuscated block within VB script and deobfuscation code
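The decrypt-on-the-fly pattern described above can be sketched in a few lines. This is a minimal illustration in Python rather than VBScript, and it is not the scheme used by the sample in Figure 2: the single-byte XOR key, the helper names and the harmless payload are all assumptions chosen for brevity.

```python
import base64

def obfuscate(source: str, key: int) -> str:
    """XOR every byte with a one-byte key, then base64-encode the result."""
    scrambled = bytes(b ^ key for b in source.encode("utf-8"))
    return base64.b64encode(scrambled).decode("ascii")

def deobfuscate(blob: str, key: int) -> str:
    """Reverse obfuscate(): base64-decode, then XOR with the same key."""
    scrambled = base64.b64decode(blob)
    return bytes(b ^ key for b in scrambled).decode("utf-8")

KEY = 0x5A                      # hypothetical key, hard-coded in the script
payload = "result = 6 * 7"      # stands in for the real script body
blob = obfuscate(payload, KEY)  # this is what would be shipped

# At run time the script restores its own source and executes it.
namespace = {}
exec(deobfuscate(blob, KEY), namespace)
```

Only `blob` and the decoding routine need to appear in the distributed script, so a scanner that looks at the file on disk never sees the readable source.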
Because .NET applications are compiled to an intermediate language and are then usually compiled just-in-time at runtime, their code is exposed to many of the same risks (from the author’s standpoint) as interpreted scripting languages: .NET executables can be decompiled to a close approximation of the original C#/VB code relatively easily.
At this point it should be stressed that not all obfuscation – even the more advanced encryption usually seen in malicious samples – is bad. Forcepoint Security Labs are aware of at least one anti-fraud product embedded within several major banking websites which protects itself using a combination of both methods.
The primary type of tool used to obfuscate and ‘protect’ compiled binaries is the packer. Traditional packers are effectively self-extracting archives – or at least they work in broadly analogous terms. Along with the compressed/obfuscated data (the original binary in obfuscated form) they contain a deobfuscator ‘stub’ which, upon execution, deobfuscates the binary and jumps to its restored entry point.
An important trait shared by every packer in this category is that the original binary is completely recovered upon deobfuscation and is available for further reverse engineering. Thus, this type of packer is easily defeated by a well-placed breakpoint in a debugger just after deobfuscation is complete, at the point where execution control is transferred to the original binary.
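The stub-plus-payload arrangement, and the reason a single breakpoint defeats it, can be modelled with a toy example. The sketch below is an analogy only: it packs Python source with zlib instead of a native binary, and the names `pack`, `run_packed` and `breakpoint_hook` are hypothetical.

```python
import zlib
from typing import Callable, Optional

def pack(original_source: str) -> bytes:
    """'Packer': store the original program in compressed form."""
    return zlib.compress(original_source.encode("utf-8"), 9)

def run_packed(packed: bytes,
               breakpoint_hook: Optional[Callable[[str], None]] = None) -> dict:
    """'Stub': restore the original program, then transfer control to it."""
    restored = zlib.decompress(packed).decode("utf-8")
    # The original is now fully recovered in memory. An analyst who stops
    # execution here can dump it before it runs.
    if breakpoint_hook is not None:
        breakpoint_hook(restored)
    namespace: dict = {}
    exec(restored, namespace)   # the 'jump to the original entry point'
    return namespace

original = "answer = sum(range(10))\n"
packed = pack(original)

captured = []                   # plays the role of the analyst's memory dump
namespace = run_packed(packed, breakpoint_hook=captured.append)
```

Whatever transformation `pack` applies, the stub has to undo all of it before handing over control, so the hook always receives the complete original.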
One of the most basic packers in use today is the open-source UPX, developed in the 1990s, which uses an extremely simple compression algorithm not designed for obfuscation. Unmodified UPX-packed binaries are trivial to unpack with the UPX command line tool, and such binaries are often automatically unpacked by most AV products. In order to avoid automatic unpacking, malware authors sometimes modify the packer.
Despite its simplicity, UPX-packed notepad.exe still doesn’t tell us a lot at first glance when loaded with IDA:
Figure 3: The notepad.exe application packed with UPX
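Unmodified UPX output is also easy to recognise programmatically. A stock UPX build typically leaves the section names UPX0 and UPX1 and the magic string UPX! in the packed file, and such a file can be restored with `upx -d`. The sketch below is a crude heuristic built on that observation: it searches raw bytes without parsing the PE headers, and a modified packer can simply rename or strip the markers. The function names are hypothetical.

```python
from typing import List

# Strings that a stock UPX build normally leaves in a packed file.
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX!")

def find_upx_markers(data: bytes) -> List[str]:
    """Return the UPX marker strings present anywhere in the raw bytes."""
    return [m.decode("ascii") for m in UPX_MARKERS if m in data]

def looks_upx_packed(data: bytes) -> bool:
    """Heuristic only: True if at least one marker is present."""
    return len(find_upx_markers(data)) > 0

# Synthetic stand-ins for file contents (not real executables).
packed_like = b"MZ" + b"\x00" * 32 + b"UPX0\x00\x00\x00\x00UPX1" + b"\x00" * 8 + b"UPX!"
plain_like = b"MZ" + b"\x00" * 32 + b".text\x00\x00\x00.data"
```

For a real file, the bytes would come from something like `open(path, "rb").read()`.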
Another commonly used example is ASPack, which is more focused on obfuscation and security than on compression. Some versions of ASPack use self-modifying code, which makes using breakpoints more difficult, but at the end of the day the same principles apply to unpacking it – it begins by pushing all registers on the stack…
Figure 4: ASPack pushing all registers onto the stack before unpacking the original binary
… and finishes by restoring all registers and jumping to the original entry point. Here the debugger has stopped just before control is transferred to the original binary:
Figure 5: ASPack popping the previously stored registers before returning control to the unpacked binary
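The save-registers, unpack, restore-registers shape shown in Figures 4 and 5 suggests a simple way to hunt for the hand-over point. The sketch below scans raw 32-bit x86 code for POPAD (0x61) immediately followed by PUSH imm32 (0x68) and RET (0xC3), and reports the pushed address as a candidate original entry point. That exact byte layout is an assumption for illustration: the real tail of the ASPack stub varies between versions, and matching bytes without disassembling will produce false positives.

```python
import struct
from typing import List, Tuple

PUSHAD = 0x60      # push all general-purpose registers
POPAD = 0x61       # pop all general-purpose registers
PUSH_IMM32 = 0x68  # push a 32-bit immediate
RET = 0xC3         # near return

def starts_with_pushad(code: bytes) -> bool:
    """True if the first instruction byte is PUSHAD."""
    return len(code) > 0 and code[0] == PUSHAD

def find_popad_push_ret(code: bytes) -> List[Tuple[int, int]]:
    """Find 'popad; push imm32; ret' and return (offset, pushed address)."""
    hits = []
    for i in range(len(code) - 6):
        if code[i] == POPAD and code[i + 1] == PUSH_IMM32 and code[i + 6] == RET:
            (target,) = struct.unpack_from("<I", code, i + 2)
            hits.append((i, target))
    return hits

# Hand-made stub: pushad; nop; nop; popad; push 0x00401000; ret
stub = bytes([0x60, 0x90, 0x90, 0x61, 0x68, 0x00, 0x10, 0x40, 0x00, 0xC3])
hits = find_popad_push_ret(stub)
```

In a debugger the same idea is applied dynamically: stop at the register restore and single-step until control leaves the stub.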
Another category of packer – or rather obfuscator, as this type increases the file size instead of reducing it – is the virtualisation-based obfuscator. These work by destroying the original binary and creating a new, functionally equivalent binary using custom bytecode which is executed on a custom obfuscated interpreter. The main takeaway is that the original binary is never restored (in contrast with the previous type of obfuscator) and it remains obfuscated throughout its execution. The significant drawbacks are the drastically increased file size and slower execution speed.
One notable example of this category is VMProtect. Manual deobfuscation of VMProtect binaries is often very difficult: one has to decode every single bytecode for every protected binary as they are randomly generated when a binary is obfuscated and therefore don’t remain constant across obfuscated binaries.
Nevertheless, with perseverance it is possible to decode each bytecode instruction, as the stack-based interpreter is fairly straightforward (despite the junk instructions used to frustrate reverse engineering attempts). ESI, which now contains the deobfuscated address of the bytecode, is used as the instruction pointer and is decremented after each bytecode instruction (see below; note all the junk instructions).
Figure 6: Screenshot showing the bytecode generated by VMProtect, use of decrementing ESI register, and junk instructions
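The two properties described above, opcode values that change with every protected binary and an instruction pointer that walks downwards through the bytecode, can be reproduced in a toy stack machine. This is a sketch for intuition only and not a model of VMProtect's actual instruction set: the four operations, the seed-based opcode table and all function names are assumptions.

```python
import random
from typing import Dict, List, Tuple

OPERATIONS = ("PUSH", "ADD", "MUL", "HALT")

def make_opcode_table(seed: int) -> Dict[str, int]:
    """Assign each operation a random, distinct byte value for one 'build'."""
    rng = random.Random(seed)
    codes = rng.sample(range(256), len(OPERATIONS))
    return dict(zip(OPERATIONS, codes))

def assemble(program: List[Tuple], table: Dict[str, int]) -> bytes:
    """Encode instructions such as ("PUSH", 2) or ("ADD",) into bytecode."""
    out: List[int] = []
    for instr in program:
        out.append(table[instr[0]])
        if instr[0] == "PUSH":
            out.append(instr[1] & 0xFF)   # one-byte operand
    # Stored in reverse so that the interpreter reads from high to low.
    return bytes(reversed(out))

def run(bytecode: bytes, table: Dict[str, int]) -> int:
    """Stack-based interpreter whose instruction pointer is decremented."""
    decode = {value: name for name, value in table.items()}
    stack: List[int] = []
    ip = len(bytecode) - 1
    while True:
        op = decode[bytecode[ip]]
        ip -= 1
        if op == "PUSH":
            stack.append(bytecode[ip])
            ip -= 1
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            return stack.pop()

# (2 + 3) * 4, protected twice with different opcode tables.
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("HALT",)]
table_a, table_b = make_opcode_table(1), make_opcode_table(2)
result_a = run(assemble(program, table_a), table_a)
result_b = run(assemble(program, table_b), table_b)
```

Both builds compute the same value, but an analyst who has worked out the opcode table for one build has to start again for the next, which is the cost the paragraph above describes.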
Pictured below is the VM handler executing the bytecode (using push ret in this instance). The interpreter is highly polymorphic and changes between obfuscated binaries.
Figure 7: VMProtect VM handler
VMProtect, like many other more advanced types of packer (whether VM-based or not), employs a number of anti-VM and anti-debugger methods to make unpacking even more difficult.
Interestingly, VMProtect does not seem to be very popular with malware authors as of mid-2017: Forcepoint Security Labs developed a generic method of detecting VMProtect-packed binaries and found that the samples identified were not typically malware, with the majority being made up of adware and cheating software for games.