Root Cause

IsoAlloc - Uninitialized Read Detection

April 4th, 2021
Chris Rohlf

Uninitialized memory vulnerabilities are a class of memory safety issues that have plagued C/C++ developers for decades. The most concise description of this vulnerability class is allocating and using memory without first clearing it of the contents that were previously written there. A simple code snippet that shows this pattern is below:

  some_struct_t *sq = (some_struct_t *) malloc(32);
  isAuthorized(sq->authFlag);

This code never calls memset to clear the memory returned by malloc before using it. There is no guarantee that the backing page is not dirty from a previous write, as is often the case when memory is freed and the chunk is handed out to a new caller. These bugs are notoriously hard to find because they don't immediately result in a signal being sent to the process. Finding them used to require heavy-handed approaches like dynamic code instrumentation and the 'PROT_NONE + SIGSEGV' trick. The latter is a bit of a hack: it requires setting a page's permissions to PROT_NONE using mprotect and then installing a signal handler to catch accesses to that page. The problem with this approach is that it also catches writes to the page, so it's insufficient for detecting uninitialized reads except in very limited circumstances. The fine-grained data we need about the page fault just isn't accessible anywhere except in the kernel, which means we would need to write a kernel module to get access to it.
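To make the 'PROT_NONE + SIGSEGV' trick concrete, here is a minimal sketch of it, assuming Linux and a single guarded page. The names alloc_guarded_page, on_segv, and fault_count are hypothetical and do not come from IsoAlloc. The handler only receives the faulting address through siginfo_t, so it counts the first access to the page without being able to say whether that access was a read or a write.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical globals describing the one page we guard */
static void *guarded_page;
static size_t guarded_len;
static volatile sig_atomic_t fault_count;

/* SIGSEGV handler: siginfo_t carries the faulting address, but nothing
 * here says whether the access was a read or a write */
static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void) sig;
    (void) ctx;
    uintptr_t addr = (uintptr_t) info->si_addr;
    uintptr_t base = (uintptr_t) guarded_page;

    if(addr < base || addr >= base + guarded_len) {
        _exit(1); /* A genuine crash somewhere else, give up */
    }

    fault_count++;

    /* Unprotect the page so the faulting instruction can be restarted.
     * mprotect is not formally async-signal-safe, this is part of the hack */
    mprotect(guarded_page, guarded_len, PROT_READ | PROT_WRITE);
}

/* Maps one inaccessible page and installs the handler.
 * Returns NULL on failure */
static void *alloc_guarded_page(void) {
    guarded_len = (size_t) sysconf(_SC_PAGESIZE);
    guarded_page = mmap(NULL, guarded_len, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if(guarded_page == MAP_FAILED) {
        return NULL;
    }

    struct sigaction sa = {0};
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    if(sigaction(SIGSEGV, &sa, NULL) != 0) {
        return NULL;
    }

    return guarded_page;
}
```

A caller that writes to the returned page first and a caller that reads from it first both bump fault_count to 1, which is exactly why this trick alone cannot single out uninitialized reads.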

Eventually Memory Sanitizer (MSan) arrived, which brought fast-ish compiler-based code instrumentation for detecting these kinds of errors. MSan works similarly to Address Sanitizer (ASan), using shadow memory and compiler-inserted instrumentation. MSan is incredibly powerful and effective, but it comes at the cost of a 3x performance overhead, which makes it unsuitable for production builds except in rare circumstances. Many organizations currently deploy canary builds with ASan/MSan enabled to sample production traffic, but this is still only useful in highly orchestrated backend service deployments.

A few years ago a new syscall named userfaultfd appeared in the Linux kernel. This syscall allows a basic page fault handler to be implemented in user space. It enables very powerful features such as hooking events for remapping of pages, forked child processes, garbage collection, and writing out-of-process page fault handlers for specific operations. The API is driven through a combination of syscall/read/poll/ioctl calls and is complicated and difficult to use, but we can use it to register a range of page addresses with the kernel and then process an event in a special thread whenever a page fault occurs anywhere in that range. These events also tell us whether the page fault was the result of a read or a write. This is enough information for us to build a simple mechanism for detecting uninitialized reads in userspace with minimal performance overhead. Using these building blocks I developed this optional feature in IsoAlloc.
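The syscall/ioctl/poll/read sequence described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not IsoAlloc's implementation: tracked_page_t, track_page, and fault_thread are hypothetical names, only one page and one fault are handled, and the fault is resolved with UFFDIO_ZEROPAGE so the faulting thread can continue. Some kernels restrict unprivileged use of userfaultfd through the vm.unprivileged_userfaultfd sysctl, in which case track_page simply returns -1.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical bookkeeping for a single tracked page */
typedef struct {
    int uffd;
    void *page;
    size_t len;
    int first_fault_was_read; /* -1 until a fault has been observed */
} tracked_page_t;

/* Maps one anonymous page and registers it for missing-page faults.
 * Returns 0 on success, -1 if userfaultfd is unavailable or not permitted */
static int track_page(tracked_page_t *t) {
    t->len = (size_t) sysconf(_SC_PAGESIZE);
    t->first_fault_was_read = -1;

    t->uffd = (int) syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if(t->uffd < 0) {
        return -1;
    }

    /* The API handshake must happen before any other ioctl */
    struct uffdio_api api = {.api = UFFD_API, .features = 0};
    if(ioctl(t->uffd, UFFDIO_API, &api) != 0) {
        close(t->uffd);
        return -1;
    }

    t->page = mmap(NULL, t->len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if(t->page == MAP_FAILED) {
        close(t->uffd);
        return -1;
    }

    struct uffdio_register reg = {
        .range = {.start = (uintptr_t) t->page, .len = t->len},
        .mode = UFFDIO_REGISTER_MODE_MISSING,
    };
    if(ioctl(t->uffd, UFFDIO_REGISTER, &reg) != 0) {
        munmap(t->page, t->len);
        close(t->uffd);
        return -1;
    }

    return 0;
}

/* Thread body: waits up to 5 seconds for one page fault event, records
 * whether it was a read or a write, then resolves it with a zero page */
static void *fault_thread(void *arg) {
    tracked_page_t *t = arg;
    struct pollfd pfd = {.fd = t->uffd, .events = POLLIN};
    struct uffd_msg msg;

    if(poll(&pfd, 1, 5000) <= 0) {
        return NULL;
    }

    if(read(t->uffd, &msg, sizeof(msg)) != (ssize_t) sizeof(msg)) {
        return NULL;
    }

    if(msg.event != UFFD_EVENT_PAGEFAULT) {
        return NULL;
    }

    t->first_fault_was_read =
        !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE);

    /* Back the range with a zero page, which also wakes the faulting thread */
    struct uffdio_zeropage zp = {
        .range = {.start = (uintptr_t) t->page, .len = t->len},
        .mode = 0,
    };
    ioctl(t->uffd, UFFDIO_ZEROPAGE, &zp);
    return NULL;
}
```

To try it, call track_page, start fault_thread with pthread_create, touch the page from the main thread, and join the handler thread. If the first touch was a load, first_fault_was_read should end up as 1. If it was a store, it should end up as 0.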

The feature works by first spawning a special thread to handle userfaultfd events, and then sampling calls to iso_alloc / malloc. Each sampled call is handled by allocating a raw page to back the request, registering that page with userfaultfd, returning the page to the caller, and then handling page fault events from the kernel for that address range. If the first event seen for the page is a write, we remove it from the list of pages we are tracking. If it's a read, then we know it's an uninitialized read, or the page wouldn't still be on the list. Below we see this feature detecting an uninitialized read in our test program.

$ cat tests/uninit_read.c 
/* iso_alloc uninit_read.c
 * Copyright 2021 - chris.rohlf@gmail.com */

#include "iso_alloc.h"
#include "iso_alloc_internal.h"

int main(int argc, char *argv[]) {
    while(1) {
        uint8_t *p = iso_alloc(1024);
        uint8_t drf = p[128];
        p[256] = drf;
        iso_free(p);
    }

    return OK;
}

$ LD_LIBRARY_PATH=build/ build/uninit_read 
[ABORTING][86027](src/iso_alloc_sanity.c:78 _page_fault_thread_handler()) Uninitialized read detected on page 7fb6ce3cf000 (1024 byte allocation)
Aborted (core dumped)
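The first-fault rule that produces this abort can be modeled without touching the kernel at all. The sketch below is a simplified illustration, assuming a small fixed-size list of tracked page addresses. tracker_t, tracker_add, tracker_on_fault, and the verdict names are hypothetical and are not IsoAlloc's internal API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 64

typedef enum {
    FAULT_IGNORED,     /* Page is not on the tracking list */
    FAULT_INITIALIZED, /* First event was a write, tracking stops */
    FAULT_UNINIT_READ  /* First event was a read, report it */
} verdict_t;

/* Hypothetical list of page addresses handed out by sampled allocations */
typedef struct {
    uintptr_t pages[MAX_TRACKED];
    size_t count;
} tracker_t;

/* Starts tracking a page. Returns false if the list is full */
static bool tracker_add(tracker_t *t, uintptr_t page) {
    if(t->count == MAX_TRACKED) {
        return false;
    }

    t->pages[t->count++] = page;
    return true;
}

/* Applies the first-fault rule to a page fault event */
static verdict_t tracker_on_fault(tracker_t *t, uintptr_t page, bool is_write) {
    for(size_t i = 0; i < t->count; i++) {
        if(t->pages[i] != page) {
            continue;
        }

        if(!is_write) {
            return FAULT_UNINIT_READ;
        }

        /* A write initializes the page: drop it by swapping in the last entry */
        t->pages[i] = t->pages[--t->count];
        return FAULT_INITIALIZED;
    }

    return FAULT_IGNORED;
}
```

In this model a write anywhere in a tracked page removes the whole page from the list, so any later read of that page returns FAULT_IGNORED no matter which bytes were actually written.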

There are some limitations to this approach. For starters, it's only available on specific versions of the Linux kernel. A majority of my testing was performed on AWS free tier with a 5.4.0 kernel, but it should work as far back as 4.11. Not all uninitialized read vulnerabilities will be detected with this technique. A caller could request a 1024 byte allocation, clear the first 32 bytes with memset(), and then access the 512th byte without initializing it. We wouldn't detect this because the page would have been removed from our tracking when those first 32 bytes were initialized. This is because userfaultfd only notifies us of page-level information, not individual memory addresses within the page. But given that we are sampling smaller calls to malloc, this is likely not a common case.

While the current implementation needs some work, it is effective at detecting these kinds of vulnerabilities. The goal is to have a feature similar to GWP-ASAN but for uninitialized reads. It should be suitable for production given its low performance overhead, especially when only sampling calls to malloc. It's available in the IsoAlloc code today along with a basic GWP-ASAN-like feature for detecting Use-After-Free and other vulnerabilities in production code. Like IsoAlloc itself, all of these features are under active development.