22

CVE-2019-1347: When a mouse over a file is enough to crash your system

 4 years ago
source link: https://blog.tetrane.com/2019/11/12/pe-parser-crash.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Analyze this yourself!

Discover Timeless Analysis Live.

CVE-2019-1347 is a vulnerability disclosed in october 2019 by Mateusz @j00ru Jurczyk in the Windows relocation mechanism when parsing a PE file. By simply placing your mouse cursor over the Proof of Concept file, a Blue Screen Of Death is triggered.

We thought the original description could be positively completed with an analysis with REVEN, our timeless analysis tool. For this analysis we recorded several short traces to isolate and understand how specific bytes in the PE led to the crash.

In total, we will show that exactly four locations are responsible for the crash, and how this can help understand the bug itself.

The minimal bytes to be modified from the original file are the following:

Offset Original New 0xd0 00 3f 0x120 00 c0 ff e7 0x169 20 af 0x1ef 42 ff

Note: Throughout this post, we will call these locations “byte”, even though the second one involves two bytes.

A fistful of bytes: First and second location

To begin with, we recorded the crash triggered by this PoC provided in the issue . From the KeBugCheck call we reach back the Page Fault and see that the address 0xfffff8035b2ae7ff is not mapped, and won’t be.

A closer look at the memcpy arguments shows that this address is built as 0xfffff8035b2a0000 + 0xe7ff , so we taint the value 0xe7ff to find where it comes from.

JRFFr2b.jpg!web

We instantly find out that this value comes unchanged from the PoC PE. Indeed, modifying this value in the PE disables the crash. Opening the PoC in any PE editor confirms that the value is related to the Relocation Directory RVA, as stated also by the issue .

The forward taint has an advantage compared to the backward taint: flags are tainted too. This can be tedious in many cases but here it turns out to be very effective when applied to 0xe7ff :

Fz6vmuU.jpg!web

Firsts tests are just zero tests. The important one is the comparison with 0xf000 . At first sight the alignment sounds like a simple overflow-related check, but it isn’t. We taint 0xf000 , it actually comes from the PE:

BrURVbI.jpg!web

The value 0xe03f is the field SizeOfImage from the PE header (modified in the PoC). The value 0xf000 is derived from this size as a 0x1000-aligned value. If we modify this entry to its original value, the crash doesn’t occur, proving that both of 0xe7ff and 0xe03f are directly linked to the crash.

We then tried to modify those two bytes in the original file, but unfortunately, it isn’t enough to trigger the BSOD. For the next parts we will determine the other needed bytes.

For a few bytes more

We decided to patch manually the file with the differences, in a -sort-of- dichotomic manner.

The idea is simple: using a PE editor, we compare the original file and the provided PoC. By applying/removing relevant modifications to the original file in correlation with the PoC, the two single bytes triggering the BSOD are isolated gradually. Note that we only had to perform this operation on the header part of the file, as the memory history reveals that the corpus of the file isn’t accessed.

This minimization could have been automated by a script that triggers and detect the BSOD in a VM, but in this case manual modification is enough.

Next part of this post analyzes how these two new bytes influence the CFG and eventually points out the bug. We recorded three traces more: the one induced by the minimal PoC (four locations modified), and two other traces where we let respectively the third byte and fourth byte unchanged. The two later tests don’t trigger the BSOD on purpose.

Third byte analysis

The third byte we modified is located at file offset 0x169 , where we replaced 0x20 by 0xaf . A PE parser shows that it is related to a directory RVA, just like the relocation RVA mentioned earlier.

To detect where the CFG changed, we used the following naïve script with the Analysis Python API, that compares two traces instruction by instruction:

while (instructions_are_equal(tr1, tr2)):
      # Fetch next instruction from first trace and second trace
      tr1_id += 1
      tr1 = rvn1.trace.transition(tr1_id)
      tr2_id += 1
      tr2 = rvn2.trace.transition(tr2_id)

The full script is available in Appendix 1.

The algorithm is effective enough as we will only focus on one function: MiRelocateImage . In a few seconds, we get this output:

python trace_diff.py --host1 localhost --port1 44389 --host2 localhost --port2 43633 --tr1 77868515 --tr2 65535121
INFO:    Finding difference between two traces...
INFO:    Instructions are different at 77868565 - 65535171
INFO:    Done

The results shows that the traces are divergent shortly after the beginning of MiRelocateImage , and the function exits almost right after a flag is tested:

jMVJ73q.jpg!web

Now we analyze why this flag is set to 0x1 . We can follow it in memory in the trace that doesn’t crash:

iUjQFfi.jpg!web

This shows that the flag 0x1 comes from a check on a value, 0xb , and that 0xb comes itself from the file, unchanged. But 0xb isn’t the byte we modified, so we can ask, why is it linked to the flag?

The answer is that 0xb is a value in a structure, and the modified value decides where this structure starts:

EBBRFnA.jpg!web

The value 0x2008 - the 3rd byte that we modified to 0xaf08 -, is responsible for pointing the beginning of a structure, and a value from this structure is checked to decide whether or not relocating the image at the beginning of MiRelocateImage . When the byte 0x20 is changed to 0xaf , the offset pointing the start of the structure changes, the value in the structure is then different (with a high probability), the derived flag isn’t set, the execution of MiRelocateImage continues, resulting eventually in the crash.

This third byte analysis doesn’t show the bug as itself, but nevertheless, it shows that REVEN can explain why it is important. Actually, instead of modifying the 3rd byte from 0x2008 to 0xaf08 , we can modify the aforementioned 0xb value to 0x0 . This causes the system crash also, proving that our analysis is correct.

For the next part, we will analyze the crash itself, the bug, and how it is linked to the 4th byte.

The crash, The bug, and The fourth byte

Taint forward against the fourth byte

Analyzing the fourth byte is indeed tricky.

First we can taint forward this byte from the moment it is parsed:

MBrIJvE.jpg!web

With IDA synchronized, we see that this byte is used as an index into the array MiImageProtectionArray , and the value fetched is 0x6 . Tainting this value 0x6 is also possible, yet in this case, the information given by the taint is verbose and difficult to analyze. We will continue this fourth byte analysis by having a look at the crash itself.

From the crash

From the beginning of this analysis, we only pointed out a page fault which couldn’t be resolved. But the page fault doesn’t seem to come from a common read overflow; the problem is elsewhere.

For this part we analyzed the code that precedes the KeBugCheckEx call:

yU7biiV.jpg!web

We can see multiple checks on a zero value, leading to the crash. This value comes from memory at the end of an array containing what looks like PTE.

The question is, does the problem come from fetching the wrong PTE? (i.e. bad index in the PTE array?) or is it because there should be an entry there that doesn’t exist?

Next part answer this question.

Is offset in PTE array wrong?

First we can try to analyze where and how this PTE address is build. We can taint this address to see where it comes from:

fUji2iy.jpg!web

The following code is responsible for the conversion:

0xfffff80359fa4ddb mov rcx, rdi 
0xfffff80359fa4dde movabs rdx, 0xfffffb8000000000
0xfffff80359fa4de8 shr rcx, 9
0xfffff80359fa4dec movabs r8, 0x7ffffffff8
0xfffff80359fa4df6 and rcx, r8
0xfffff80359fa4df9 mov rax, rdx
0xfffff80359fa4dfc add rcx, rax
0xfffff80359fa4dff mov qword ptr [rbp - 0x28], rcx

At the beginning of this code, rdi contains the address 0xfffff8035b2ae7ff , that needs to be mapped. This code doesn’t seem to have any flaw, so we can deduce that probably, the problem is that the entry containing zero should have been populated, yet it isn’t.

Entry in PTE array isn’t populated

It is very probable that this array is populated in a loop, so we can find out where other entries have been set and see why the empty one isn’t:

BBVRzyA.jpg!web

We reach MiAddMappedPtes . Basically, it takes as argument the amount of PTE to add, and adds them. Let’s taint backward this number (i.e. 0xf ), and almost immediately see that it comes from the first byte we’ve modified in the file:

QZBFNbb.jpg!web

The value 0xf corresponds to the required number of 0x1000-aligned blocks to handle a size of 0xe03f. This looks consistent, but the problem seems that even though 0xf entries should be added, only 0xe effectively are :

0xe_PTE_added_001_cut_red.png

Another thing we can do is executing the previous script to detect where the execution differs from a trace that doesn’t crash:

python trace_diff.py --host1 localhost --port1 44389 --host2 localhost --port2 43633 --tr1 77868515 --tr2 65535121
INFO: Finding difference between two traces...
INFO: Instructions are different at 77868565 - 65535171
INFO: Done

We analyze these results:

UFry2yv.jpg!web

At some point, a branch is taken to call GetSharedProtos instead of MiGetSubsectionDriverProtos . This is probably interesting but right now we don’t know enough about the context to exploit this information.

We need to analyze the loop termination conditions, at the end of the MiAddMappedPtes , to understand why only 0xe entries are added.

There are three consecutive checks:

0xfffff8035a50ea22 cmp rbx, rsi
0xfffff8035a50ea25 jae 0xfffff8035a50ea58 ($+0x33)

0xfffff8035a50ea27 cmp r11, rdi
0xfffff8035a50ea2a jae 0xfffff8035a50ea78 ($+0x4e)

0xfffff8035a50ea78 mov rbp, qword ptr [rbp + 0x10]
0xfffff8035a50ea7c test rbp, rbp
0xfffff8035a50ea7f je 0xfffff8035a65d5b6 ($+0x14eb37)

First condition check - SizeOfImage

It is pretty straightforward: the upper bound correspond to 0xf entries and it isn’t reached

vummqu7.jpg!web

The code just gets a pointer on the last entry of the array that will be filled.

Second condition check - Pointing out the chained list

The upper bound seems to be the number of 0x1000-aligned blocks needed for a section. Taint analysis shows that it is computed from the size of the section, located in the PE header.

uQFjY3N.jpg!web

More precisely, this number of page is set in a chained list entry, by MiParseImageSectionHeader :

N3a6Jbi.jpg!web

This chained list entry is important as the last check depends on it.

Third condition check - Do we need to add one last PTE?

If the next entry in the chained list is empty (i.e. the current one is the last section), the algorithm checks whether or not it should add more pages. And this is where the bug occurs: instead of comparing the amount of page already effectively added with amount of page to be added, it compares the address in an array with the last entry of… a completely different array .

In the trace where the 4th byte isn’t tampered with (the crash is disabled), the code adds a new entry properly, but not in the trace that crashes. Here is what we can see in both:

n2m2Ijn.jpg!web

Everything looks fine for the disabled test, but for the crash version, the first address compared is at a way higher address, hence the code considering that the upper boundary is reached and don’t need to add PTE anymore. The last entry isn’t populated, and the crash occurs when this last page is accessed, as no PTE exists for it.

Basically, what was intended:

// Check if another PTE is needed
if (&array[current] < &array[last])
{
  add_entry();
}

But in this case, arrays may differ, hence the comparison being faulty.

The 4th byte, the last piece of the puzzle, is demystified now.

4th byte: Forcing the bad comparison

Impact

Recall that the naïve trace differential showed that GetSharedProtos is called instead of MiGetSubsectionDriverProtos . The two arrays that are incorrectly compared, come from these functions, respectively. The bug could have stayed unnoticed but the fourth byte tampering forces the usage of GetSharedProtos instead of MiGetSubsectionDriverProtos , leading to the faulty comparison.

Origin

The branch that calls GetSharedProtos is taken because of the value 0x2 in memory, we can use the memory history on this word:

qymQVvU.jpg!web

Once again, we analyze the instructions right before this branch and see that the value in r8b is tested. This value is 0x6 , which for an attentive reader may ring a bell. Taint analysis may help to trace where it comes from, or in this case we just fetch where it has been modified a few instruction before.

Actually, the value 0x6 comes from MiImageProtectionArray (a result found previously), and the index in this array was derived from the byte we modified in the file. This is how the 4th byte influences the CFG to force the comparison and trigger the bug.

Once upon a patch in the kernel

We recorded a last trace against an updated version of Microsoft Windows, to see how the bug is fixed.

Whilst we were expecting some checks around the faulty array comparison, the patch just avoids the need to reach that code.

In MiAddMappedPtes , we saw earlier that 0xf PTE are to be added. The code goes through a chained list containing structures that seem to represent each section, and the amount of PTE needed for each of these section. Let’s go through this chained list:

InQfy2j.jpg!web

For each entry, we can see the pointer to the next entry at offset +0x10 , and the number of blocks to add at offset +0x2c . In the crash version, this is a recap of blocks needed:

  • section 1 needs 2 pages
  • section 2 needs 8 pages
  • section 3 needs 2 pages
  • section 4 needs 2 pages

0xf - 2 - 8 - 2 - 2 = 0x1 last page to be added. Previously, the last page was added with an ad-hoc faulty piece of code that we described earlier.

In the patched version, we can see the following:

  • section 1 needs 2 pages
  • section 2 needs 8 pages
  • section 3 needs 2 pages
  • section 4 needs 3 pages

The last section has one more page, hence no need to add (through the faulty code) a last page. We can analyze the last round of MiAddMappedPtes and see where this 0x3 comes from:

jUJFvin.jpg!web

For each section, MiParseImageSectionHeaders creates an entry, with (among other) the amount of 0x1000-block needed in it. This number is derived from the size defined in the PE header as we showed earlier.

Basically, the patch is: if there are still blocks to add compared to the SizeOfImage value ( 0xf blocks in total here), then when the last section is handled, the amount is replaced by the actual number of remaining needed pages.

The following pseudo code represents what is done in the patched version:

total_block = SizeToBlock(Image)
for each section:
  // HERE IS THE PATCH
  total_block -= SizeToBlock(section)
  if (IsLastSection(section)):
      current_nblock = total_block
  else:
      current_nblock = SizeToBlock(section)
  BuildEntry([...] , current_nblock, [...])

This ad-hoc check and add used to be in MiAddMappedPtes , they are moved now, so the faulty code isn’t executed anymore.

As such, the last PTE is correctly added and no crash occurs.

Conclusion

There is no previous analysis for this vulnerability at the time of writing. We showed how we could analyze it precisely with REVEN, minimized the PoC and explained the influence of each faulty byte. In particular, we used the taint feature many times to quickly go through many memory manipulation and find the origin of some values.

Even though the CFG is usually tedious to follow, thanks to the Python Analysis API and other features, we were able to point out where key branches were taken and analyze why. Moreover, we did analyze precisely how Microsoft patched this issue; and it is now easier to figure out if this patch is enough or not.

Finally, this logical error wasn’t trivial to analyze. The capability to navigate the trace in time instead of restarting again and again the parsing with a debugger allowed us to spare a fair amount of time.

Appendix 1

Naïve script to perform simple trace differential analysis. Given two traces and two transitions, this script returns the transition (“instruction number”) when the bytecode differs.

import argparse

import reven2
import logging

logging.basicConfig(format='%(levelname)s:\t%(message)s', level=logging.INFO)


def parse_args():
  parser = argparse.ArgumentParser(description='Find the first different instruction between two traces\n')
  parser.add_argument('--host1', metavar='host1', dest='host1', help='Reven host 1, as a string '
                      '(default: "localhost")', default='localhost', type=str)
  parser.add_argument('--port1', metavar='port1', dest='port1', help='Reven port for first server'
                      ', as an int (default: 13370)', type=int, default=13371)
  parser.add_argument('--host2', metavar='host2', dest='host2', help='Reven host 1, as a string '
                      '(default: "localhost")', default='localhost', type=str)
  parser.add_argument('--port2', metavar='port2', dest='port2', help='Reven port for second server'
                      ', as an int (default: 13370)', type=int, default=13372)
  parser.add_argument('--tr1', metavar='tr1', dest='tr1', help='Start transition for the first trace', type=int)
  parser.add_argument('--tr2', metavar='tr2', dest='tr2', help='Start transition for the second trace', type=int)

  args = parser.parse_args()
  return args


def instructions_are_equal(tr1, tr2):
  """
  From 2 transitions, return True if instructions are identicals.
  """
  return tr1.instruction.raw == tr2.instruction.raw

if __name__ == '__main__':

  args = parse_args()
  logging.info("Finding difference between two traces...")

  # Get a server instance for both traces
  rvn1 = reven2.RevenServer(args.host1, args.port1)
  rvn2 = reven2.RevenServer(args.host2, args.port2)

  tr1_id = args.tr1
  tr2_id = args.tr2

  tr1 = rvn1.trace.transition(tr1_id)
  tr2 = rvn2.trace.transition(tr2_id)

  i = 0

  while (instructions_are_equal(tr1, tr2)):
      # Fetch next instruction from both traces
      i += 1
      tr1_id += 1
      tr1 = rvn1.trace.transition(tr1_id)
      tr2_id += 1
      tr2 = rvn2.trace.transition(tr2_id)

      if i % 100 == 0:
          logging.debug("{0} are identicals".format(i))

  logging.info("Instructions are different at {0} - {1}".format(tr1.id, tr2.id))

  logging.info("Done")

Analyze this yourself!

Discover Timeless Analysis Live.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK