Replacing TSAN’s runtime with a mock library that does nothing – and why that’s useful.
ThreadSanitizer (TSAN) is a powerful data race detector built into Clang and GCC. It instruments every load, store, and synchronization operation in programs, then tracks how they interact at runtime. This instrumentation is essential for TSAN’s magic, but it’s also a fantastic hook for other kinds of dynamic analyses.
In this post, I’ll explore how to (mis)use the instrumentation provided by -fsanitize=thread not to find data races, but to build our own checkers. Along the way, I’ll share what I learned about how the compiler wires TSAN’s hooks, why LD_PRELOAD isn’t enough, and how to swap the runtime safely.
The result is libtsano: a drop-in TSAN replacement that runs real programs, executes atomics correctly, and otherwise ignores the instrumentation. It forms a blank canvas for further checkers.
When you compile a program with -fsanitize=thread, the compiler instruments memory and control operations by emitting calls such as:
void __tsan_read4_pc(const void* addr, void* pc);
void __tsan_write8_pc(const void* addr, void* pc);
void __tsan_func_entry(void* pc);
void __tsan_func_exit(void);
void __tsan_acquire(void* addr);
void __tsan_release(void* addr);
These functions are mainly annotations to inform TSAN about the program execution. In contrast, atomic operations are fully delegated to the TSAN runtime library. Examples are:
uint32_t __tsan_atomic32_load(const volatile uint32_t* a, int mo);
void __tsan_atomic32_store(volatile uint32_t* a, uint32_t v, int mo);
int __tsan_atomic32_compare_exchange_strong(volatile uint32_t* a,
    uint32_t* c, uint32_t v, int mo, int fail_mo);
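To make the shape of the annotation hooks concrete, here is a hand-written approximation of what an instrumented function conceptually looks like. It is only a sketch, not actual compiler output: sum_pair and the counters are hypothetical names, the pc values are placeholders, and the hooks are defined locally as simple counters so the example stands alone.

```c
#include <stdint.h>

/* Hypothetical counters, used only to observe the hook calls. */
static int n_reads, n_entries, n_exits;

/* Local stand-ins for the runtime hooks; a real runtime would record
 * the access instead of counting it. */
void __tsan_read4_pc(const void* addr, void* pc) { (void)addr; (void)pc; n_reads++; }
void __tsan_func_entry(void* pc) { (void)pc; n_entries++; }
void __tsan_func_exit(void) { n_exits++; }

/* Hand-instrumented version of `return p[0] + p[1];`.
 * Each 4-byte load is announced to the runtime before it happens. */
int32_t sum_pair(const int32_t* p) {
    void* pc = __builtin_return_address(0); /* placeholder pc value */
    __tsan_func_entry(pc);
    __tsan_read4_pc(&p[0], pc);
    int32_t a = p[0];
    __tsan_read4_pc(&p[1], pc);
    int32_t b = p[1];
    __tsan_func_exit();
    return a + b;
}
```

The loads themselves stay in the program; the hooks only report them. That is why a runtime can turn these hooks into no-ops without changing program behavior, which is not true for the atomic entry points.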
When the linker runs with -fsanitize=thread, it links the TSAN runtime library into the program. Depending on the compiler, it defaults to either a shared or a static runtime library. For runtime swapping, I want shared TSAN (.so / .dylib) so that the dynamic linker resolves the __tsan_* symbols at program start. The flag that enforces this is covered further below.
LD_PRELOAD isn’t enough
Interception here means getting my definitions of the __tsan_* functions resolved instead of the originals. To do that, I need to place my shared library earlier in the dynamic linker’s search order.
Binaries built with -fsanitize=thread have the real TSAN embedded as a DT_NEEDED (ELF) or LC_LOAD_DYLIB (Mach-O) dependency. That means the dynamic linker will still load the real TSAN library, even if all __tsan_* symbols are intercepted by my library. Consequently, the TSAN runtime runs its constructors, initializes global state, creates extra threads, and may intercept functions of other libraries (such as pthread_create).
In summary, loading TSAN together with an interceptor library usually leads to double initialization and causes TSAN to abort at program exit because its internal consistency checks no longer match expectations.
So, I had to change my plan to this:
libtsan,libtsan is expected to handle them.
That’s what libtsano does: all hooks are no-ops, except atomics, which are implemented with compiler builtins.
/* Hooks: no-ops */
void __tsan_func_entry(void* pc) { (void)pc; }
void __tsan_func_exit(void) {}
void __tsan_read4_pc(const void* a, void* pc) { (void)a; (void)pc; }
void __tsan_write8_pc(const void* a, void* pc) { (void)a; (void)pc; }
void __tsan_acquire(void* addr) { (void)addr; }
void __tsan_release(void* addr) { (void)addr; }
Atomics use compiler builtins with __ATOMIC_SEQ_CST semantics to preserve correctness:
uint32_t __tsan_atomic32_load(const volatile uint32_t* a, int mo) {
    (void)mo;
    return __atomic_load_n(a, __ATOMIC_SEQ_CST);
}
void __tsan_atomic32_store(volatile uint32_t* a, uint32_t v, int mo) {
    (void)mo;
    __atomic_store_n(a, v, __ATOMIC_SEQ_CST);
}
int __tsan_atomic32_compare_exchange_strong(volatile uint32_t* a,
    uint32_t* c, uint32_t v, int mo, int fail_mo) {
    (void)mo;
    (void)fail_mo;
    /* weak = 0 requests the strong variant */
    return __atomic_compare_exchange_n(a, c, v, 0,
        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
The full implementation in tsano.c covers most bit-widths and RMW operations.
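As an illustration of how the other widths and RMW operations could be covered, the sketch below generates fetch_add and exchange for four bit-widths. It uses a plain preprocessor macro, which is my assumption for the example and not necessarily how tsano.c is written; TSANO_RMW is a hypothetical name.

```c
#include <stdint.h>

/* Expands to one fetch_add and one exchange entry point per bit-width.
 * The requested memory order is ignored and SEQ_CST is used, mirroring
 * the 32-bit functions shown above. */
#define TSANO_RMW(bits)                                            \
    uint##bits##_t __tsan_atomic##bits##_fetch_add(                \
        volatile uint##bits##_t* a, uint##bits##_t v, int mo) {    \
        (void)mo;                                                  \
        return __atomic_fetch_add(a, v, __ATOMIC_SEQ_CST);         \
    }                                                              \
    uint##bits##_t __tsan_atomic##bits##_exchange(                 \
        volatile uint##bits##_t* a, uint##bits##_t v, int mo) {    \
        (void)mo;                                                  \
        return __atomic_exchange_n(a, v, __ATOMIC_SEQ_CST);        \
    }

TSANO_RMW(8)
TSANO_RMW(16)
TSANO_RMW(32)
TSANO_RMW(64)
```

Both operations return the value held before the update, so a caller can check, for example, that fetch_add on a counter holding 40 returns 40 and leaves 42 behind after adding 2.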
syscall.c stubs out sanitizer-specific syscall hooks as empty functions, keeping the runtime inert and minimal.
At the time of writing, the approximate function counts are:
| Category | Approx. count | Notes |
|---|---|---|
| Memory/sync events | ~20 | __tsan_read*/write*, acquire/release, mutex, func entry/exit |
| Atomics | ~60–80 | All integer widths + weak/strong variants |
| Fences/init/thread stubs | ~10 | initialization, finalization, syscall wrappers |
→ Total: about 90–110 functions
All of them are defined as extern "C" symbols with the same signature as in libtsan.
There was a lot of repetitive work in this implementation, so I ended up using a small templating tool to expand similar functions such as __tsan_read2, __tsan_read4, __tsan_read8, and others. I’ll cover that approach in a future post.
The alternative to preloading is to completely replace libtsan before runtime, but after compilation and linking.
ELF-based systems such as Linux and NetBSD (and likely other BSDs) rely on several mechanisms to locate the shared libraries linked with a program:
- LD_LIBRARY_PATH: the highest-priority runtime search path (ignored for setuid binaries).
- RUNPATH: a tag embedded in the binary at link time, listing directories to search.
- The system default directories: /lib, /usr/lib, etc.
On older systems, you may encounter RPATH instead of RUNPATH. RPATH is searched before LD_LIBRARY_PATH, so a library found through RPATH wins and LD_LIBRARY_PATH is never consulted for it. With RUNPATH, however, LD_LIBRARY_PATH still takes precedence. Fortunately, most modern systems now use RUNPATH.
Hence, if one places (or symlinks) the custom shared library under the exact name expected by the binary in a directory listed first in LD_LIBRARY_PATH, the dynamic linker will pick it instead of the vendor’s TSAN. This works consistently on Linux with glibc or musl, on NetBSD, and probably on all other BSDs.
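One way to confirm which runtime the loader actually picked is to ask the process itself. The sketch below is ELF-only and assumes dl_iterate_phdr is available (it is on glibc, musl, and the BSDs); count_loaded_objects and struct match are hypothetical names for this example.

```c
#define _GNU_SOURCE /* glibc exposes dl_iterate_phdr only with this */
#include <link.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct match {
    const char* needle; /* substring to look for in each object path */
    int count;          /* number of loaded objects that matched */
};

/* Called once per loaded object (main program and shared libraries). */
static int on_object(struct dl_phdr_info* info, size_t size, void* data) {
    struct match* m = data;
    (void)size;
    if (info->dlpi_name && strstr(info->dlpi_name, m->needle)) {
        printf("loaded: %s\n", info->dlpi_name);
        m->count++;
    }
    return 0; /* zero means: keep iterating */
}

/* Returns how many loaded objects have `needle` in their path. */
int count_loaded_objects(const char* needle) {
    struct match m = { needle, 0 };
    dl_iterate_phdr(on_object, &m);
    return m.count;
}
```

Calling count_loaded_objects("tsan") early in an instrumented program prints the path of every matching object, so it is easy to see whether the replacement directory or the vendor's directory was used.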
On macOS, one should use DYLD_LIBRARY_PATH (or DYLD_FALLBACK_LIBRARY_PATH) to prioritize the replacement. TSAN on macOS is typically named libclang_rt.tsan_*.dylib. Placing the replacement dylib under that name in a directory listed first in DYLD_LIBRARY_PATH ensures the dynamic linker will pick it instead of the system’s TSAN runtime.
libtsano starts no threads, installs no signal handlers, and doesn’t patch syscalls, but keeps atomics correct. That makes it an ideal foundation for other runtime extensions.
Now I can layer a second shared library via LD_PRELOAD (ELF) / DYLD_INSERT_LIBRARIES (macOS) that interposes the same symbols and adds behavior:
instrumented app → your-checker.dso → libtsano (atomics + stubs) → OS
The checker can log, analyze, or enforce synchronization while delegating atomic operations to libtsano. No background interference, no global state conflicts.
By interposing I mean: intercept the call, do your work, then forward to the next implementation, e.g., dlsym(RTLD_NEXT, ...) on ELF, or DYLD_INTERPOSE on macOS. Interposition has plenty of gotchas and I might cover them in a later post.
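A minimal sketch of such a checker for a single symbol on ELF could look like this. The store counter and the name checker_stores_seen are hypothetical, and the fallback to a compiler builtin when no next definition exists is my addition so that the snippet also works when run on its own.

```c
#define _GNU_SOURCE /* for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdint.h>

typedef void (*store32_fn)(volatile uint32_t*, uint32_t, int);

static unsigned long checker_store_count; /* hypothetical statistic */
static void* next_store32;                /* cached dlsym result */

unsigned long checker_stores_seen(void) {
    return __atomic_load_n(&checker_store_count, __ATOMIC_RELAXED);
}

/* Interposed entry point: do the checker's work, then forward. */
void __tsan_atomic32_store(volatile uint32_t* a, uint32_t v, int mo) {
    __atomic_fetch_add(&checker_store_count, 1, __ATOMIC_RELAXED);

    void* sym = __atomic_load_n(&next_store32, __ATOMIC_RELAXED);
    if (!sym) {
        /* Look up the next definition in load order (libtsano). */
        sym = dlsym(RTLD_NEXT, "__tsan_atomic32_store");
        __atomic_store_n(&next_store32, sym, __ATOMIC_RELAXED);
    }
    if (sym)
        ((store32_fn)sym)(a, v, mo);
    else
        __atomic_store_n(a, v, __ATOMIC_SEQ_CST); /* no next definition */
}
```

Built as a shared object and preloaded ahead of libtsano, the lookup should find libtsano's definition. On older glibc versions the link step may also need -ldl.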
Mocking TSAN is fun, not because I dislike TSAN, but because it’s a perfect entry point into compiler-driven instrumentation.
Once you have a no-op runtime that the compiler already trusts, you can bend it to do anything you want: tracing, invariant checking, your own race detector, or even record/replay. The best part is that you neither need to patch Clang or GCC nor build fragile compiler plugins to do it.
The dynamic linker / loader runs at process startup, before main() executes.
The dynamic linker is ld-linux.so (glibc), ld.elf_so (NetBSD), ld-musl.so (musl), dyld (macOS), or an equivalent. It is responsible for mapping all dependent shared libraries, resolving symbol relocations, running constructors (.init_array), and, finally, transferring control to your program’s entry point.
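The constructor step is why a loaded library can act before the program does anything. The small sketch below shows the mechanism with the GCC/Clang constructor attribute; fake_runtime_init and runtime_was_initialized are hypothetical names, not part of any TSAN runtime.

```c
/* Set by the constructor below; stands in for a runtime's global state. */
static int runtime_initialized;

/* Runs during startup, before main(), because of the attribute. */
__attribute__((constructor))
static void fake_runtime_init(void) {
    runtime_initialized = 1;
}

/* Lets the program observe that initialization already happened. */
int runtime_was_initialized(void) {
    return runtime_initialized;
}
```

By the time main() calls runtime_was_initialized(), the flag is already set. A real TSAN runtime that gets loaded as a dependency initializes itself the same way, whether or not any of its __tsan_* symbols end up being used.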
- Check the exact library name the binary expects with ldd, readelf -d, or otool -L and make sure your symlink matches it.
- The binary must link the shared TSAN runtime; use the -shared-libsan flag to enforce that on Clang.
- Setuid binaries ignore LD_LIBRARY_PATH/DYLD_LIBRARY_PATH for security reasons. This trick won’t work there.
- Use DYLD_INSERT_LIBRARIES (not DYLD_LIBRARY_PATH) to inject the checker on macOS; DYLD_LIBRARY_PATH is for replacing dependencies.
- dlsym(RTLD_NEXT, ...) is the usual way to “call through.” On macOS, prefer DYLD_INTERPOSE for clean forwarding.
© 2025 db7, licensed under CC BY 4.0.