
Thursday, June 13, 2013

HiDump: Tracing and Extraction of Runtime Injected Code

Malware analysis is Fun!

It is particularly satisfying when the analyst manages to identify and extract the hidden core logic/component of a given malware sample that has been crypted and protected in order to hinder analysis efforts.

In reality, most of the malware we have encountered is either partially packed or uses in-memory injection of decoded/unpacked core logic at runtime to evade antivirus products. It is usually trivial to unpack such malware using a debugger, or IDA with its debugger plugin. Rarely, we have also encountered malware protected with advanced engines such as Themida or VMProtect, which are non-trivial to analyse due to extensive code obfuscation, virtualised instructions and multiple anti-debugger techniques.


Perhaps the most commonly used crypter technique is RunPE. The original executable is encoded/encrypted and embedded inside a Stub Executable, either in a PE section, as EOF data or as a Resource. The Stub Executable in turn decodes/decrypts the original executable in-memory at runtime and uses a RunPE engine to load and execute it.
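As an illustration of the "EOF data" case, here is a minimal Python sketch that returns whatever bytes follow the last section's raw data in a PE file. The function name `find_overlay` is our own, and the sketch assumes a well-formed PE file; note that legitimate data such as an Authenticode signature also lives past the last section, so a non-empty result is only a hint, not proof of an embedded payload.

```python
import struct


def find_overlay(pe_bytes: bytes) -> bytes:
    """Return the bytes stored after the last section's raw data (EOF data)."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header fields, relative to the PE signature.
    (num_sections,) = struct.unpack_from("<H", pe_bytes, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", pe_bytes, e_lfanew + 20)
    # The section table follows the optional header; each entry is 40 bytes.
    table = e_lfanew + 24 + opt_size
    end = table + num_sections * 40  # end of headers as a lower bound
    for i in range(num_sections):
        entry = table + i * 40
        # SizeOfRawData at +16, PointerToRawData at +20.
        raw_size, raw_ptr = struct.unpack_from("<II", pe_bytes, entry + 16)
        end = max(end, raw_ptr + raw_size)
    return pe_bytes[end:]
```

In practice the returned bytes would still be encoded/encrypted by the crypter, so this only locates the embedded blob; it does not decode it.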

The RunPE technique consists of the following steps:
  • Execute a host process (HP) (say notepad.exe) with CREATE_SUSPENDED flag set.
  • Identify the ImageBaseAddress of the original executable (OE) to be loaded from its PE Header.
  • Attempt to allocate memory for OE in HP's address space at OE's expected base address.
  • If OE's expected base address is not available, unmap HP's original mappings and allocate memory.
  • Write OE's PE sections into the address space of HP using WriteProcessMemory(..) API.
  • Change the ImageBase in HP's PEB if required.
  • Resume execution of main thread in HP.
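The second step in the list above (reading OE's ImageBase from its PE header) can be sketched in a few lines of Python. This is only an illustration under the assumption of a well-formed PE file; the function name `read_image_base` is hypothetical, and a real RunPE engine would do the same lookup in native code.

```python
import struct

PE32_MAGIC = 0x10B       # optional header magic for 32-bit images
PE32_PLUS_MAGIC = 0x20B  # optional header magic for 64-bit images


def read_image_base(pe_bytes: bytes) -> int:
    """Return the preferred ImageBase stored in a PE file's optional header."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", pe_bytes, opt)
    if magic == PE32_MAGIC:
        # PE32: 4-byte ImageBase at offset 28 of the optional header.
        (image_base,) = struct.unpack_from("<I", pe_bytes, opt + 28)
    elif magic == PE32_PLUS_MAGIC:
        # PE32+: 8-byte ImageBase at offset 24 (there is no BaseOfData field).
        (image_base,) = struct.unpack_from("<Q", pe_bytes, opt + 24)
    else:
        raise ValueError(f"unknown optional header magic {magic:#x}")
    return image_base
```

The value returned is the address at which the RunPE engine first tries to allocate memory inside HP.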
The crux of the technique lies in the OS's support for the CREATE_SUSPENDED flag in the CreateProcess(..) API. This flag tells the kernel to suspend the main thread of the newly created process immediately after the PE is loaded and its sections are mapped. From this point, the main thread stays suspended until another thread calls the ResumeThread(..) API on it; only then is control transferred to NTDLL's LdrInitialize(..) and the process loaded in the usual manner (import resolution, base relocation etc.).

The RunPE technique involves replacing the original program image in the address space of the newly created process with that of the program it intends to execute. In-memory execution of a program would otherwise involve doing everything that the OS's program loader does. To avoid doing all of this by itself, the RunPE technique lets the OS do the work and simply replaces the original program content with its own payload at the right time, i.e. after the target program is mapped into the address space of the newly created process but before the PE loading process is initiated.

Extracting executables protected with RunPE-like crypters is usually trivial, as it involves setting breakpoints on memory allocation and memory write routines and dumping the decoded/decrypted content using a debugger. However, things get a little tricky with a bit of obfuscation and anti-debug ...

The VMProtect Story

In the past, we came across a piece of malware (Stage-1, really) which simply does the following:

  • Fetch an encrypted DLL (Stage-2) over the Internet via HTTP
  • Decrypt the DLL in-memory
  • Inject it into the explorer.exe address space

Essentially, the core logic of the malware resided in the Stage-2 DLL. However, due to custom compression and encryption, it was not possible to obtain the DLL for further analysis just from a pcap dump. Usually the next approach would be to use a debugger to trace the Stage-1 executable, set appropriate breakpoints and obtain a dump of the decrypted DLL before it is injected into explorer.exe.

However, it turned out that Stage-1 was protected with VMProtect with debugger detection turned on. It was not possible to use a debugger to trace Stage-1, as it detected the presence of a debugger and halted execution (we did not try a kernel debugger at that time).

At that time, we solved the issue using a Pin tool. Since Pin does not use the Windows Debugging API for injection or management of a Pin tool, it was possible to trace the execution of Stage-1 with one. We could hook WriteProcessMemory(..) and extract the decrypted DLL from Stage-1 for further analysis.


The idea for HiDump was conceived based on our experience with analysing protected malware with anti-debugging capability. Our objective was to devise a generic technique for extracting data/code written using WriteProcessMemory(..). Our implementation attempts to avoid using the Windows Debug API in order to play nice with anti-debugger checks.


The system consists of 2 core components:
  • Loader: Executes the target executable with the 'Monitor' injected into it.
  • Monitor: A DLL that hooks Windows APIs for monitoring and data extraction.

The Loader performs the following steps:
  1. Start the Target exe with the CREATE_SUSPENDED flag
  2. Inject the Monitor DLL into the address space of the Target exe
  3. Resume the main thread of the Target exe (continue execution)

The Monitor is implemented using the Microsoft Detours library for API hooking. Essentially, the Monitor does the following:
  • Hook OpenProcess, VirtualAllocEx, WriteProcessMemory, CloseHandle and CreateRemoteThread
  • Build a State-Machine for data/code capture as per hook trigger events
  • On End-State, dump data to disk

The Monitor internally maintains a State-Machine that starts with OpenProcess(..) and usually ends with CreateRemoteThread(..). The reason for maintaining a State-Machine is to record every memory write in the same order and at the same offset at which a given block of data is written into the address space of the target process. When an event marked as an End-State occurs, such as a call to CreateRemoteThread(..), the Monitor attempts to dump all data recorded for the corresponding context (consisting of the Process Handle and the allocated and written memory).
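To make the idea concrete, the following is a simplified Python model of such a State-Machine. It is not the code of the actual Monitor (which is a native DLL built on Detours); the class and method names are hypothetical, and each `on_*` method stands in for what the corresponding API hook would report.

```python
from dataclasses import dataclass, field


@dataclass
class InjectionContext:
    """Everything recorded for one remote process handle."""
    allocations: dict = field(default_factory=dict)  # base address -> size
    writes: list = field(default_factory=list)       # (address, data) in call order


class InjectionTracker:
    def __init__(self):
        self.contexts = {}  # process handle -> InjectionContext
        self.dumps = []     # (handle, base address, reconstructed bytes)

    def _context(self, handle):
        # Assumption: a handle not seen via OpenProcess still gets a context.
        return self.contexts.setdefault(handle, InjectionContext())

    def on_open_process(self, handle):
        self.contexts[handle] = InjectionContext()  # start state

    def on_virtual_alloc_ex(self, handle, base, size):
        self._context(handle).allocations[base] = size

    def on_write_process_memory(self, handle, address, data):
        self._context(handle).writes.append((address, bytes(data)))

    def on_create_remote_thread(self, handle):
        # End state: replay the writes, in order, into each allocated region.
        ctx = self.contexts.get(handle)
        if ctx is None:
            return
        for base, size in ctx.allocations.items():
            image = bytearray(size)
            for address, data in ctx.writes:
                offset = address - base
                if 0 <= offset < size:
                    chunk = data[:size - offset]
                    image[offset:offset + len(chunk)] = chunk
            self.dumps.append((handle, base, bytes(image)))

    def on_close_handle(self, handle):
        self.contexts.pop(handle, None)  # context is no longer needed
```

Replaying the writes in their original order means that a later write to the same offset overwrites an earlier one, just as it would in the target process. A real implementation would write each reconstructed region to disk instead of keeping it in a list.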

This technique currently lacks the capability to identify and extract code injected using the SetWindowsHookEx(..) or QueueUserAPC(..) APIs. However, we believe the tool can be extended to cover those cases as well.

Proof of Concept Implementation:

A proof of concept implementation is available on GitHub.

Sample Execution against VMP Protected Executable

Monitor logs captured using syelogd.exe