Use-After-Free in C: Why It Happens, How Static Analyzers Catch It, and What Rust Does Differently
Introduction: Why Use-After-Free Doesn't Always Crash
Memory is a crucial topic when it comes to building software. Programs running on historical devices referenced physical memory locations directly, a practice superseded by a dedicated hardware component: the Memory Management Unit (MMU). We now work with virtual memory addresses, which the MMU translates to physical addresses for us [0]. For a hands-on exploration of how process memory is laid out on macOS ARM64, see the memory post.
malloc and free: Why Accessing Freed Memory Can Silently Succeed
I recently stumbled across a video regarding the Rust language. The intent of this post is not to look down on Rust but to observe the behaviour of a trivial C program. A program fragment was shown in the video, similar to the one shown in Listing 1. After building and running the executable, we notice that the program exits normally, even though it wrote to a region of memory that had already been freed. In contrast, Rust reports this issue during compilation.
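#include <stdlib.h>

int
main (void)
{
    int *x = malloc(sizeof(int));   /* allocate room for one int */
    free(x);                        /* hand the chunk back to the allocator */
    *x = 0xA455;                    /* use-after-free: x is now dangling */
    exit(0);
}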
One would assume that the program from Listing 1 should receive the segmentation violation signal (SIGSEGV) when executed, as it attempts to access a freed memory region. Listing 2 shows otherwise. This is a classic use-after-free bug, yet the process exits normally: no crash, no signal, no warning. The full story is more complex than described here. In essence, malloc-like memory allocation library functions do more than just allocate memory. Demystifying malloc is not the purpose for now; interested readers can browse link [1] provided at the end. To request any operation from the system, we use system calls. To request more memory for a process, we use the mmap(2) system call.
Script started on Wed Aug 27 16:13:10 2025
bash-3.2$ ./listing1
bash-3.2$ echo $?
0
bash-3.2$ exit
Script done on Wed Aug 27 16:13:16 2025
The output for the program listing1 (and others) is captured using the script(1) utility. The output from this utility may contain control characters that are then removed using the col(1) command. The command is used as follows:
$ SHELL=/bin/bash script <output-file-name>
...
$ col -b < <input-file-name> > <output-file-name>
My default shell is zsh. By default, script(1) uses the shell named in the SHELL environment variable. My zsh configuration contains coloring and other "special" characters that end up in the output file. Unfortunately, even col (with the given flag) is not able to clean up that terminal output. For simplicity, I chose to show the output from the bash shell.
mmap and munmap: When Accessing Freed Memory Does Crash
How mmap Differs from malloc for Memory Allocation
As the name suggests, mmap(2) is used to map a file described by a file descriptor into the process's virtual memory. But it also allows anonymous mapping, which is how it can serve as a lower-level alternative to malloc for memory allocation.
mmap(2) is more general than malloc-like library functions. For instance, we can specify the protection of the memory region. By default, this system call assumes that the caller wishes to map a region of a file into memory, so we need to state explicitly that we intend to map anonymous memory and not a file. The return value is the starting address of the mapped memory. Where the mapping is placed is implementation defined; the address typically lies somewhere between the stack and the heap of a process, regions we explored in detail in the process address space layout post.
Listing 3 shows a program similar to the one shown in Listing 1. The first argument to mmap(2) is an address that the kernel uses as a "hint" for where to place the start of the mapped region. Unless the MAP_FIXED flag is used, any existing mapping at the requested address is not replaced. If mmap(2) is called with the MAP_FIXED flag and the first argument is an address that already contains a mapping, the previous mapping is replaced upon successful return. The use of MAP_FIXED is discouraged if portability is a consideration.
The following program replaces malloc and free with mmap and munmap to allocate and release memory directly through system calls, demonstrating that the same use-after-free access pattern now triggers a segmentation fault.
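#include <sys/mman.h>
#include <stdlib.h>

int
main (void)
{
    /* Ask the kernel directly for an anonymous, private mapping. */
    int *x = mmap(NULL,
                  sizeof(int),
                  PROT_WRITE,
                  MAP_ANONYMOUS | MAP_PRIVATE,
                  /* ignored */ -1,
                  /* ignored */ 0);
    munmap(x, sizeof(int));         /* the mapping is gone after this */
    *x = 0xA455;                    /* access to an unmapped address */
    exit(0);
}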
When the program from Listing 3 is compiled and executed, we see the behavior shown in Listing 4. The process terminates due to a segmentation violation. When this program is executed inside a debugger, you'll notice that the signal is received when the memory address is dereferenced for the assignment of a value.
Script started on Wed Aug 27 16:18:50 2025
bash-3.2$ ./listing3
Segmentation fault: 11
bash-3.2$ echo $?
139
bash-3.2$ exit
Script done on Wed Aug 27 16:19:01 2025
Why munmap Causes a Segfault but free Does Not
As mentioned earlier, malloc(3) does not simply allocate memory and return the address of the allocated region. Internally, it performs various memory management operations (as can be seen in musl's implementation [1]). As we've seen in Listing 3, a segmentation violation occurs if the region received from mmap(2) is accessed after it has been unmapped using munmap(2). I have yet to explore the actual implementation of the free function, but at a glance, it looks like the process "advises" the system that the information contained in the memory region is no longer needed and that the region can be reused right away. A process provides such advice to the system through the madvise(2) system call. The free(3) function may eventually invoke the munmap(2) system call as well, but not necessarily on every call, which is why the write in Listing 1 landed on memory that was still mapped.
Catching Use-After-Free at Compile Time with Static Analysis
clang --analyze and gcc -fanalyzer: Static Analyzers for C
The Rust compiler is able to detect such errors through static analysis of the source file, specifically through its ownership model and borrow checker, which enforce memory safety rules at compile time. During compilation, the Rust compiler performs the checks that most compilers perform for their respective languages. In addition, it statically analyzes the source for memory-related issues such as this one (use-after-free) along with other ownership model checks. This does not mean that compilers for the C language lack this feature. For instance, gcc provides the -fanalyzer option, which performs inter-procedural analysis during compilation. clang from LLVM provides a similar feature through the --analyze option, along with a command-line utility called scan-build that runs the analyzer as part of the build process.
Address Sanitizer and UBSan: Runtime Alternatives to Static Analysis
As we can see in Listing 5, the static analyzer reports that the program suffers from a use-after-free issue. I usually prefer the runtime analyzer over the static analyzer because its reporting is more verbose. The compiler flag -fsanitize=address instruments the program with AddressSanitizer (ASan) so that issues like use-after-free are detected at runtime, with a detailed stack backtrace showing exactly where the invalid access occurred. Another flag I frequently use is -fsanitize=undefined, which detects undefined behavior at runtime (UBSan). UBSan can detect, among other things, out-of-bounds array subscripts where the bounds can be statically determined [2], signed integer overflow, and dereferencing of misaligned or null pointers [3].
These sanitizer flags are among the most useful compiler options for debugging; for more on compiler flags and the C compilation pipeline, see the portability post.
The following output shows clang --analyze detecting the use-after-free bug at compile time, without needing to run the program — reporting exactly which line dereferences memory that was previously passed to free.
Script started on Wed Aug 27 16:28:49 2025
bash-3.2$ clang --analyze -DLISTING1 segfault.c
segfault.c:107:6: warning: Use of memory after it is freed [unix.Malloc]
*x = 0xA455;
~~ ^
warning generated.
bash-3.2$ exit
Script done on Wed Aug 27 16:29:15 2025
Tradeoffs: Compile Time Cost vs Bug Detection Coverage
Static analysis has its pros and cons. It helps harden the program being built against some commonly found bugs. But it comes with a tradeoff: compilation time. The documentation for both gcc and clang mentions that static analysis is expensive compared to other warning flags [4].
References
- [0] Before the advent of the MMU, programs used physical memory locations in RAM for various operations. The problems this caused would require a blog post of their own. LaurieWired made a video discussing virtual memory addressing called How a Clever 1960s Memory Trick Changed Computing. It's safe to say that most general-purpose computers have a dedicated MMU that handles the translation of virtual memory addresses to physical addresses in RAM. The virtual address ranges of one process are not necessarily distinct from those of another process. In fact, if two programs were to use the same standard C library, those processes would probably load the library at identical virtual addresses, although it is not a requirement.
- [1] https://git.musl-libc.org/cgit/musl/tree/src/malloc/mallocng/malloc.c
- [2] One might assume that this issue should be a consideration for ASan. Consider a function variable foo that is an array of characters. Such function variables are stored on the stack, not in the data or the bss section, during runtime. For a program under execution, there probably won't be any segmentation violation if an attempt is made to access a subscript of foo that exceeds the declared size. This is because a function's stack frame contains more than that one array: whatever the architecture's ABI has the function prologue save (such as the return address and the caller's frame pointer), a potential stack canary (that must not be tampered with), and the local variables declared in the function. On most architectures the stack grows toward lower addresses, so a buffer overflow can reach past the array and modify the saved return address, causing the function to return somewhere else and potentially leading to arbitrary code execution.
- [3] When I tried to run the programs from Unix Network Programming, by W.R. Stevens, some of them caused this runtime error. The buffered data from the network does not only contain ASCII text. For example, if the server sends binary data for a struct timeval to the client, the client can't interpret the raw bytes directly and needs to assign them to a variable of type struct timeval. A character variable can sit at any 1-byte-aligned address, but the same is not true for a struct timeval variable. If this structure is 8-byte aligned, then an address whose 3 least significant bits (LSb) are not all zero triggers a memory alignment issue.
- [4] https://gcc.gnu.org/onlinedocs/gcc/Static-Analyzer-Options.html and https://clang-analyzer.llvm.org/
