Introduction - Tracing Memory Access with an LLVM Pass

This blog post details how to implement an LLVM Pass that traces memory accesses - by instrumenting code at compile time so that each load and store opcode is augmented with tracing logic.

LLVM Passes are an extraordinarily powerful tool that allows us to hook our own logic into the LLVM compilation process. By leveraging an extensive API that works on LLVM IR, we can run custom passes that belong to one of three categories:

Analysis
Transform
Utility

LLVM Passes are similar in spirit to GCC plugins, but with richer documentation and a more fleshed-out ecosystem. The LLVM Project invests significant effort in a clean codebase - and that effort shines when developing passes.

Initial Motivation - Backstory

My initial interest in LLVM Passes began in a previous project I worked on - I was compiling code and running it in a system with:

Full-fledged networking capabilities, i.e.
- The ability to open sockets, accept connections, open connections
No built-in debugging facilities
- No ability to remotely attach to the running code
- No OS primitives to allow for implementing GDB’s Remote Serial Protocol (i.e. no ability to implement a functional gdbserver alternative)
- Limited tracing capabilities

I often found myself frustrated that even though I was able to establish full network communications with the system, I was unable to properly debug my code beyond black-boxing and the limited tracing capabilities afforded to me. I found myself thinking something along the lines of:

Note

What if every single opcode was augmented with a network protocol that would communicate with a debugging host?

What this means is - for every single opcode - we would have code that sends a packet out to a host to notify it of the current debugging state, and then waits for further commands (the standard step / continue / run to).

This is a pretty crazy fucking idea, because:

We would effectively have full debugging control over the target system - utilizing only standard networking capabilities
We could leverage this idea to implement GDB’s Remote Serial Protocol and allow for full gdb-based debugging

There are some stark disadvantages that immediately come to mind:

Performance would take a massive beating - every opcode would suddenly have an overhead of hundreds if not thousands of additional opcodes that would handle the debuggee-debugger communications. In practice the overhead would be ameliorated by the usage of breakpoints - not every opcode would have to go through a full networking session, but the overhead would still be tremendous
Code size would take a similar beating
Since we want to debug at opcode-level granularity - we would need to be very careful when implementing the surrounding debugging logic, to make sure it doesn't clobber the state of the code being debugged (registers, flags, stack)
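To make the idea and its breakpoint fast path a bit more concrete, here is a minimal sketch in C of what such a per-opcode hook could look like. Everything in it is hypothetical: the names (dbg_hook, dbg_state, dbg_command), the fixed-size breakpoint table, and the transport, which is a plain function pointer standing in for a real socket round trip. Only "step" and "continue" are modeled; "run to" is left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical commands a debugging host could answer with. */
typedef enum { DBG_CMD_STEP, DBG_CMD_CONTINUE } dbg_command;

/* Hypothetical transport: report the current location to the host and block
 * until it replies. The real thing would be a socket round trip; a function
 * pointer keeps this sketch self-contained. */
typedef dbg_command (*dbg_transport_fn)(uintptr_t pc);

#define DBG_MAX_BREAKPOINTS 16

typedef struct {
    uintptr_t breakpoints[DBG_MAX_BREAKPOINTS];
    size_t breakpoint_count;
    bool single_stepping;       /* set after the host asks for "step" */
    dbg_transport_fn transport;
    unsigned long round_trips;  /* times we actually contacted the host */
} dbg_state;

/* Stand-in transport that always answers "continue". */
dbg_command dbg_transport_always_continue(uintptr_t pc) {
    (void)pc;
    return DBG_CMD_CONTINUE;
}

bool dbg_is_breakpoint(const dbg_state *s, uintptr_t pc) {
    for (size_t i = 0; i < s->breakpoint_count; i++) {
        if (s->breakpoints[i] == pc) return true;
    }
    return false;
}

/* The hook that would run before every instrumented opcode. */
void dbg_hook(dbg_state *s, uintptr_t pc) {
    assert(s != NULL && s->transport != NULL);
    /* Fast path: not stepping and no breakpoint here, so skip the network. */
    if (!s->single_stepping && !dbg_is_breakpoint(s, pc)) return;
    s->round_trips++;
    s->single_stepping = (s->transport(pc) == DBG_CMD_STEP);
}
```

With one breakpoint registered, calling dbg_hook for any other address returns immediately, and only the call at the breakpoint address pays for a round trip to the host. Even the fast path is a table scan per opcode, which is where the "still tremendous" overhead comes from.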

But how is this even possible? We’re not talking about function-level hooks - which would be easy to implement with something like GCC’s -finstrument-functions - we’re talking about opcode-level hooks. This is beyond the level of what we can accomplish with source code modifications - we need to insert logic into the compilation process itself.
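For contrast, this is roughly what the function-level version looks like. When code is compiled with -finstrument-functions, the compiler emits calls to two hooks with these exact names and signatures on every function entry and exit; the hook bodies and the trace_depth counter below are my own illustration, not anything prescribed by GCC.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical bookkeeping: current nesting depth of instrumented calls. */
int trace_depth = 0;

/* Called on entry to every instrumented function. The attribute keeps the
 * hook itself from being instrumented, which would recurse forever. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    trace_depth++;
    fprintf(stderr, "enter %p (called from %p), depth %d\n",
            this_fn, call_site, trace_depth);
}

/* Called just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    assert(trace_depth > 0);
    fprintf(stderr, "exit  %p (called from %p), depth %d\n",
            this_fn, call_site, trace_depth);
    trace_depth--;
}
```

Linking this file into a program built with -finstrument-functions is enough to get an enter/exit line for every function call. That is as fine-grained as this flag gets, though: it knows about functions, not about the individual opcodes inside them.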

This is exactly the sort of purpose that LLVM Passes and GCC Plugins can serve.

From a technical standpoint, this is really really really cool, because:

We become all-powerful in the compilation process. The compiler is no longer a black box that takes our source code and outputs opcodes at will - we can force ourselves into this process and run arbitrary logic that can allow for debugging, tracing, runtime optimizations, and much more. Cool cool cool!
We can reduce source code-level complexity and offload the complexity to the compilation process. Rather than writing source code-level hooks, or peppering our code with source code-level traces and boilerplate - we can write clean, to-the-point code, and offload boilerplate processes to LLVM passes. Also cool cool cool!

I got excited about the idea, but it was of course very ambitious, and it was definitely not worth the investment in the context of the specific project.

So I stowed away the idea for the future, and settled down to work on a wildly less ambitious idea, though not too different in its nature:

A Less Ambitious Idea - Memory Traces

Let’s say that - for whatever reason - we want to know about every single access to memory, that is - reads and writes.

Let’s say we have code like this:

int *a = (int *)0x12345678;
int *b = (int *)0xDEADBEEF;
 
*b = 0xF00;
 
*a = *b;

and let’s say that we want to trace these memory accesses to a file, something like:

[Write]: Wrote value 0xF00 to address 0xDEADBEEF
[Read]: Read value 0xF00 from address 0xDEADBEEF
[Write]: Wrote value 0xF00 to address 0x12345678

Then we could write code like this:

FILE *fp = fopen("trace.log", "w+");
int *a = (int *)0x12345678;
int *b = (int *)0xDEADBEEF;
 
*b = 0xF00;
fprintf(fp, "[Write]: Wrote value 0x%X to address %p\n", (unsigned)*b, (void *)b);
 
*a = *b;
fprintf(fp, "[Read]: Read value 0x%X from address %p\n", (unsigned)*b, (void *)b);
fprintf(fp, "[Write]: Wrote value 0x%X to address %p\n", (unsigned)*a, (void *)a);

This small example illustrates the problem - there is no way to hermetically cover all places in our C source code that read or write from memory. We have to manually find all of these places and manually add our traces - this is entirely untenable in any real codebase, not to mention the fact that it would muddy up our actual logic with ugly tracing logic. We would also have no guarantee that the compiler doesn’t introduce any “surprise” memory reads or writes that we can’t pick up on from the source code.

What we’re trying to do here is fundamentally unsuited for source code-level work - what we really want to do is tell the compiler:

Note

Every time you emit a load or store opcode - supplement it with tracing logic that traces what’s about to be loaded or stored.
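Conceptually, the result we are after behaves like the hand-written sketch below. The hook names (trace_load, trace_store), their signatures, and the traced_accesses counter are all made up for illustration, and the example works on real variables rather than the fake addresses from earlier so that it can actually run. The whole point of the pass is that the compiler would insert these calls for us; here they are written out by hand only to show the shape.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter so the effect of the hooks can be checked. */
unsigned long traced_accesses = 0;

/* Hypothetical hook paired with every load. */
void trace_load(FILE *out, void *addr, unsigned int value) {
    assert(out != NULL);
    traced_accesses++;
    fprintf(out, "[Read]: Read value 0x%X from address %p\n", value, addr);
}

/* Hypothetical hook paired with every store. */
void trace_store(FILE *out, void *addr, unsigned int value) {
    assert(out != NULL);
    traced_accesses++;
    fprintf(out, "[Write]: Wrote value 0x%X to address %p\n", value, addr);
}

/* What `*b = 0xF00; *a = *b;` behaves like once each access has a hook. */
void traced_example(FILE *out, int *a, int *b) {
    trace_store(out, b, 0xF00);           /* about to store 0xF00 to b */
    *b = 0xF00;

    int tmp = *b;                         /* the load itself */
    trace_load(out, b, (unsigned int)tmp);

    trace_store(out, a, (unsigned int)tmp); /* about to store to a */
    *a = tmp;
}
```

Calling traced_example(stderr, &x, &y) on two ordinary ints prints one read line and two write lines in the same format as the trace above, with real stack addresses in place of 0xDEADBEEF and 0x12345678.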

You might ask, “but why”? To which I’d answer:

Why not?
This tracing could be useful for understanding memory access patterns
This is a relatively simple POC that demonstrates and fleshes out concepts that can be relevant for more complex ideas, such as the network-based debugging from the initial motivation

The next page outlines some helpful resources and setup tips - but if you want to get down to brass tacks, this page starts the implementation of the pass itself.