We want our pass to add a function to the binary that performs this logic:
void traceMemory(void *Addr, uint64_t Value, bool IsLoad) {
  if (IsLoad)
    fprintf(_MemoryTraceFP, "[Read] Read value 0x%lx from address %p\n", Value, Addr);
  else
    fprintf(_MemoryTraceFP, "[Write] Wrote value 0x%lx to address %p\n", Value, Addr);
}
Notice how the calls to fprintf write to a FILE *_MemoryTraceFP - where should this file
pointer come from? We could insert fopen/fclose logic directly into traceMemory - but then we
would be opening and closing our log on each memory access, which is wasteful.
The preferable solution is to define a global FILE *_MemoryTraceFP - and initialize it just once.
Much like C, LLVM IR does not allow for top-level initialization of a global variable like this,
i.e. we cannot define
FILE *_MemoryTraceFP = fopen(...);
on a global scope. So where should we initialize the file pointer?
The appropriate place for this initialization is right at the beginning of main. Since we’re
implementing a module pass, we’ll need to identify the module that defines main - and add
initialization instructions to main's entry block.
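To make the goal concrete, here is a minimal source-level sketch of what the instrumented binary should end up doing: one global file pointer, opened once, and used by every call to traceMemory. This is plain C++ rather than anything the pass emits. The helpers openMemoryTrace and closeMemoryTrace are hypothetical names introduced only for this illustration, and the global is called MemoryTraceFP because identifiers that start with an underscore followed by an uppercase letter are reserved in C++ source.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Stand-in for the post's _MemoryTraceFP (renamed; see the note above).
static std::FILE *MemoryTraceFP = nullptr;

// Hypothetical helper: the one-time initialization we want at the start of main.
bool openMemoryTrace(const char *Path) {
  MemoryTraceFP = std::fopen(Path, "w+");
  return MemoryTraceFP != nullptr;
}

// Hypothetical helper: closing the log is not covered in this part of the post.
void closeMemoryTrace() {
  if (MemoryTraceFP) {
    std::fclose(MemoryTraceFP);
    MemoryTraceFP = nullptr;
  }
}

// Same logic as the traceMemory shown at the top, using PRIx64 so the
// format specifier matches uint64_t on every platform.
void traceMemory(void *Addr, std::uint64_t Value, bool IsLoad) {
  assert(MemoryTraceFP && "the log must be opened before the first access");
  if (IsLoad)
    std::fprintf(MemoryTraceFP, "[Read] Read value 0x%" PRIx64 " from address %p\n",
                 Value, Addr);
  else
    std::fprintf(MemoryTraceFP, "[Write] Wrote value 0x%" PRIx64 " to address %p\n",
                 Value, Addr);
}
```

A program would call openMemoryTrace("memory-traces.log") once at the top of main and then traceMemory on each access. The rest of this post builds the first half of that (the global and its initialization) directly in LLVM IR.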
Step One - Add A Global File Pointer to Our main Module
All we want to do is add a file pointer to the main-containing-module’s global scope - so that we
can use it in our traceMemory function.
Let’s start fleshing out our pass’s run function:
llvm::PreservedAnalyses run(llvm::Module &M,
                            llvm::ModuleAnalysisManager &) {
  Function *main = M.getFunction("main");
  if (main) {
    addGlobalMemoryTraceFP(M);
    errs() << "Found main in module " << M.getName() << "\n";
    return llvm::PreservedAnalyses::none();
  } else {
    errs() << "Did not find main in " << M.getName() << "\n";
    return llvm::PreservedAnalyses::all();
  }
}
And let’s define addGlobalMemoryTraceFP as follows:
const std::string FilePointerVarName = "_MemoryTraceFP";
void addGlobalMemoryTraceFP(llvm::Module &M) {
  auto &CTX = M.getContext();
  // Declare the global as an i8* (standing in for FILE *) if it does not exist yet.
  M.getOrInsertGlobal(FilePointerVarName, PointerType::getUnqual(Type::getInt8Ty(CTX)));
  GlobalVariable *namedGlobal = M.getNamedGlobal(FilePointerVarName);
  namedGlobal->setLinkage(GlobalValue::ExternalLinkage);
}
In essence, all we do is declare an externally-linked int8 *_MemoryTraceFP and add it to our
module using llvm::Module::getOrInsertGlobal.
Note
What linkage should our global file pointer have? We’ll only need it inside traceMemory - which
we can add to the same compilation module as main.
traceMemory will need external linkage - because we’ll want to call it from all modules - but the
file pointer itself should have internal linkage.
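Why then do we use GlobalValue::ExternalLinkage and not GlobalValue::InternalLinkage? Because
with InternalLinkage the module verifier (I used the LLVM 13 toolchain) rejects the module with
this error:
"Global is external, but doesn't have external or weak linkage!"
I first suspected a bug, but this is most likely expected behavior: getOrInsertGlobal creates the
global without an initializer, which makes it a declaration, and a declaration cannot have
internal linkage. Giving the global an initializer (for example a null pointer) should turn it
into a definition that accepts InternalLinkage.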
We can see the effects of this pass:
> cat main.c
int main() { return 0; }
> clang -S -emit-llvm main.c
> opt -load-pass-plugin ./lib/libMemoryTrace.so -passes=memory-trace main.ll -S
...
@_MemoryTraceFP = external global i8*
...
Cool! Now let’s add an initialization of this global variable to our main function.
Step Two - Initializing the Global File Pointer in main
Let’s add another function call to our pass’s run function:
llvm::PreservedAnalyses run(llvm::Module &M,
                            llvm::ModuleAnalysisManager &) {
  Function *main = M.getFunction("main");
  if (main) {
    addGlobalMemoryTraceFP(M);
    addMemoryTraceFPInitialization(M, *main);
    errs() << "Found main in module " << M.getName() << "\n";
    return llvm::PreservedAnalyses::none();
  } else {
    errs() << "Did not find main in " << M.getName() << "\n";
    return llvm::PreservedAnalyses::all();
  }
}
Now, let’s think about what needs to happen in our addMemoryTraceFPInitialization function:
- We need to make sure we can use fopen, by using llvm::Module::getOrInsertFunction.
- fopen has this signature: FILE *fopen(const char *filename, const char *mode). And so we need to define the filename and mode we will pass into the call to fopen.
- We need to introduce an actual call to fopen at the beginning of main.
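Before looking at the IR-building code, it may help to see the three stages above as the source code they roughly correspond to. This is only a sketch: initMemoryTraceFP is a hypothetical function standing in for the instructions the pass places at the top of main, and the global is named MemoryTraceFP instead of _MemoryTraceFP because the underscore-uppercase prefix is reserved in C++ source.

```cpp
#include <cstdio>  // Stage 1: makes a declaration of fopen visible,
                   // roughly what getOrInsertFunction does for the module.

// Stage 2: the filename and mode live in global character arrays.
char FopenFileNameStr[] = "memory-traces.log";
char FopenModeStr[] = "w+";

// The array sizes include the terminating NUL byte, which is why the
// generated IR later in this post shows [18 x i8] and [3 x i8].
static_assert(sizeof(FopenFileNameStr) == 18, "17 characters plus NUL");
static_assert(sizeof(FopenModeStr) == 3, "2 characters plus NUL");

// The global file pointer from Step One (renamed; see the note above).
std::FILE *MemoryTraceFP = nullptr;

// Stage 3 (hypothetical wrapper): call fopen and store the result in the
// global. The pass emits this as the first instructions of main instead.
void initMemoryTraceFP() {
  MemoryTraceFP = std::fopen(FopenFileNameStr, FopenModeStr);
}
```

The pass cannot write this C++ directly, so each line has an IR-level counterpart: a function declaration, two global variables with constant initializers, a call instruction, and a store.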
Putting it all together, we get:
void addMemoryTraceFPInitialization(llvm::Module &M, llvm::Function &MainFunc) {
  auto &CTX = M.getContext();
  // Declare fopen as i8* fopen(i8*, i8*); i8* stands in for FILE * and const char *.
  std::vector<llvm::Type*> FopenArgs{
    PointerType::getUnqual(Type::getInt8Ty(CTX)),
    PointerType::getUnqual(Type::getInt8Ty(CTX))
  };
  FunctionType *FopenTy = FunctionType::get(PointerType::getUnqual(Type::getInt8Ty(CTX)),
                                            FopenArgs,
                                            false);
  FunctionCallee Fopen = M.getOrInsertFunction("fopen", FopenTy);
  // Define the filename and mode strings as global arrays.
  // cast<> asserts (in builds with assertions enabled) if the result is not a
  // GlobalVariable; dyn_cast<> would return null there and be dereferenced.
  Constant *FopenFileNameStr = llvm::ConstantDataArray::getString(CTX, "memory-traces.log");
  Constant *FopenFilenameStrVar = M.getOrInsertGlobal("FopenFileNameStr", FopenFileNameStr->getType());
  cast<GlobalVariable>(FopenFilenameStrVar)->setInitializer(FopenFileNameStr);
  Constant *FopenModeStr = llvm::ConstantDataArray::getString(CTX, "w+");
  Constant *FopenModeStrVar = M.getOrInsertGlobal("FopenModeStr", FopenModeStr->getType());
  cast<GlobalVariable>(FopenModeStrVar)->setInitializer(FopenModeStr);
  // Insert the call to fopen at the first insertion point of main's entry block.
  IRBuilder<> Builder(&*MainFunc.getEntryBlock().getFirstInsertionPt());
  llvm::Value *FopenFilenameStrPtr = Builder.CreatePointerCast(FopenFilenameStrVar, FopenArgs[0],
                                                               "fileNameStr");
  llvm::Value *FopenModeStrPtr = Builder.CreatePointerCast(FopenModeStrVar, FopenArgs[0],
                                                           "modeStr");
  llvm::Value *FopenReturn = Builder.CreateCall(Fopen, {FopenFilenameStrPtr, FopenModeStrPtr});
  // Store the returned file pointer into the global from Step One.
  GlobalVariable *FPGlobal = M.getNamedGlobal(FilePointerVarName);
  Builder.CreateStore(FopenReturn, FPGlobal);
}
If we run our pass, we can see that memory-traces.log is created!
> ls
# Does not contain memory-traces.log!
> opt -load-pass-plugin ./lib/libMemoryTrace.so -passes=memory-trace main.ll -S -o modified_main.ll
Found main in module main.ll
> clang modified_main.ll -o modified_main
> ./modified_main
> ls
# Contains memory-traces.log!!!
This is really cool!! We initiated the creation of a file from our executable using nothing but LLVM IR instructions introduced by our LLVM Pass. Magic!
The generated LLVM IR looks cool too:
@_MemoryTraceFP = external global i8*
@FopenFileNameStr = global [18 x i8] c"memory-traces.log\00"
@FopenModeStr = global [3 x i8] c"w+\00"
define dso_local i32 @main() #0 {
  %1 = call i8* @fopen(i8* getelementptr inbounds ([18 x i8], [18 x i8]* @FopenFileNameStr, i32 0, i32 0), i8* getelementptr inbounds ([3 x i8], [3 x i8]* @FopenModeStr, i32 0, i32 0))
  %2 = load i8*, i8** @_MemoryTraceFP
  store i8* %1, i8** @_MemoryTraceFP
Up next - implementing our traceMemory function using this global file pointer.