Wednesday, July 11, 2012

Analyzing Password Protected Word Documents

Recently we were tasked with analyzing a malicious Word (.doc) document. On opening the document in Sandboxie, we noticed that it successfully exploited a vulnerability in Microsoft Word (Microsoft Office 2007 was used for testing) and dropped an executable as its payload.

The exploit package was found to work across a wide range of Windows and Office configurations. Because the exploit was so reliable, we were keen to analyze the vulnerability it was exploiting.

Usually it is not very challenging to get a rough idea of the vulnerability being exploited by a doc file, as we have various internal scripts that can dump the various sections of a doc file. In this case, however, the sample was encrypted, and a password had to be provided before the vulnerability was triggered. Since we had already been given the password, our job was only to analyze the vulnerability being exploited by the sample.

Using Ruby OLE, the Document Structure was found to be as below:

Initially we hoped to find off-the-shelf tools that could decrypt or remove the password from a given Word document. Failing to find anything usable, we went ahead and tried to write a tool to decrypt the Word document for conventional analysis.

As per MSDN:
  • In a file that is password protected by using Office binary document RC4 CryptoAPI encryption as specified in [MS-OFFCRYPTO] section 2.3.5, FibBase.fEncrypted MUST be 1 and FibBase.fObfuscation MUST be 0.
  • The EncryptionHeader as specified in [MS-OFFCRYPTO] section MUST be written in unencrypted form in the first FibBase.lKey bytes of the Table stream. The remainder of the Table stream, the WordDocument stream beyond the initial 68 bytes, and the entire Data stream MUST be encrypted.
  • These three streams of data MUST be encrypted in 512-byte blocks. The block number MUST be set to zero at the beginning of the stream and MUST be incremented at each 512 byte boundary. The encryption algorithm MUST be carried out at the beginning of the Table stream and the WordDocument stream even though some of the bytes are written in unencrypted form.
  • If fDocProps is set in the EncryptionHeader.Flags, the Encryption stream MUST be present, the Summary Information stream MUST NOT be present, and a placeholder Document Summary Information stream MUST be present as specified in [MS-OFFCRYPTO] section
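
The checks described in the points above can be sketched in a few lines of Python. This is only an illustration, not the scripts we wrote: the function names are hypothetical, and the field offsets (flag word at 0x0A, lKey at 0x0E within the 32-byte FibBase) are our reading of the [MS-DOC] layout and should be verified against the specification before being relied upon.

```python
import struct

FIB_BASE_SIZE = 32       # FibBase occupies the first 32 bytes of the WordDocument stream
F_ENCRYPTED = 0x0100     # assumed position of fEncrypted in the flag word at offset 0x0A
F_OBFUSCATION = 0x8000   # assumed position of fObfuscation in the same flag word
BLOCK_SIZE = 512         # encryption block size quoted above


def parse_fib_base(word_document: bytes) -> dict:
    """Pull the encryption-related fields out of the start of the WordDocument stream."""
    if len(word_document) < FIB_BASE_SIZE:
        raise ValueError("WordDocument stream is too short to hold a FibBase")
    w_ident, n_fib = struct.unpack_from("<HH", word_document, 0)
    (flags,) = struct.unpack_from("<H", word_document, 0x0A)
    (l_key,) = struct.unpack_from("<I", word_document, 0x0E)
    return {
        "wIdent": w_ident,
        "nFib": n_fib,
        "fEncrypted": bool(flags & F_ENCRYPTED),
        "fObfuscation": bool(flags & F_OBFUSCATION),
        "lKey": l_key,  # size of the unencrypted EncryptionHeader in the Table stream
    }


def flags_match_rc4_encryption(fib: dict) -> bool:
    """True when fEncrypted is 1 and fObfuscation is 0.

    This is a necessary condition only; the EncryptionHeader itself
    still has to be parsed to learn which scheme is in use.
    """
    return fib["fEncrypted"] and not fib["fObfuscation"]


def iter_blocks(stream: bytes):
    """Yield (block_number, chunk) pairs, with the block number starting at zero
    and increasing at every 512-byte boundary."""
    for offset in range(0, len(stream), BLOCK_SIZE):
        yield offset // BLOCK_SIZE, stream[offset:offset + BLOCK_SIZE]
```

Given the raw bytes of the WordDocument stream, parse_fib_base() reports whether the sample claims to be encrypted, and iter_blocks() produces the block numbers that a decryption routine would need for each 512-byte chunk. The key derivation itself is deliberately left out.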

After a couple of hours of effort, we realized that it really was not worth writing a decryption engine for Word documents based on such scarce documentation. Some half-baked scripts that we wrote for dumping various encryption-related structures are available here. So we turned to our Swiss Army knife (really a set of knives rather than a single one): it was time to give up the current approach and turn to dynamic analysis.

Finally we hit upon the idea of identifying the shellcode's execution in memory and setting a breakpoint at the code where the shellcode is executed. The idea was quite simple: most Windows shellcode starts with a stub that resolves the base address of kernel32.dll by walking the loaded module list, and then uses that base address to resolve LoadLibrary and GetProcAddress. We therefore set a break-on-read on the head pointer of the PEB's loaded module list.
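
As a rough illustration of where such a breakpoint could go, the sketch below computes the addresses of the three module list heads inside PEB_LDR_DATA and formats them as WinDbg-style "ba r4" commands. The debugger choice, the helper names, and the example address are assumptions made for illustration; the offsets are the commonly documented ones for 32-bit Windows and differ on 64-bit systems.

```python
# Offsets of the list heads inside PEB_LDR_DATA on 32-bit Windows (assumed layout).
LDR_LIST_HEADS = {
    "InLoadOrderModuleList": 0x0C,
    "InMemoryOrderModuleList": 0x14,
    "InInitializationOrderModuleList": 0x1C,
}


def watch_addresses(ldr_address: int) -> dict:
    """Map each list name to the address of its head pointer.

    ldr_address is the value of PEB.Ldr, i.e. the pointer stored at
    PEB + 0x0C in a 32-bit process.
    """
    return {name: ldr_address + off for name, off in LDR_LIST_HEADS.items()}


def windbg_commands(ldr_address: int) -> list:
    """Format one 4-byte read/write hardware breakpoint command per list head."""
    return [f"ba r4 {addr:08x}" for addr in watch_addresses(ldr_address).values()]
```

For a hypothetical Ldr value of 0x00251EA0, windbg_commands(0x00251EA0) yields three commands, one per list head. The Windows loader also reads these lists, so each hit still has to be checked by hand: a read issued from an address outside any loaded module is the interesting one.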

Once the shellcode was intercepted, we took a minidump of the Word process, which had the decrypted doc file in memory. Some effort will be required to recreate an unencrypted doc file from the memory dump, but that was more or less enough to study and analyze the exploited vulnerability.
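
One possible starting point for that recreation effort is to scan the dump for byte patterns that mark document data. The sketch below is a hedged illustration rather than the procedure we followed: it looks for the OLE compound file signature and for a FIB prefix (wIdent 0xA5EC followed by an assumed nFib of 0x00C1), and the helper names and the dump file name are hypothetical. Hits are only candidates and need manual inspection.

```python
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file header magic
FIB_PREFIX = bytes.fromhex("ECA5C100")             # wIdent 0xA5EC + assumed nFib 0x00C1


def find_all(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


def scan_dump(data: bytes) -> dict:
    """Collect candidate offsets for compound file headers and FIB structures."""
    return {
        "cfb_headers": find_all(data, CFB_SIGNATURE),
        "fib_candidates": find_all(data, FIB_PREFIX),
    }


def scan_dump_file(path: str) -> dict:
    """Read a dump from disk and scan it (loads the whole file into memory)."""
    with open(path, "rb") as handle:
        return scan_dump(handle.read())
```

Calling scan_dump_file("winword.dmp") on a dump (the file name is an example) lists the offsets worth opening in a hex editor first.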