GDB for Multi-threaded C: Hunting Race Conditions Without Losing Your Mind
Debugging single-threaded C in GDB is mostly mechanical: set a breakpoint, step, print, repeat. Multi-threaded C is a different game. Bugs vanish under the debugger. Stack traces lie. Variables seem to change value between two prints.
These are the GDB techniques that actually help.
Know Where You Are
First, see all threads:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ff... "app" __futex_abstimed_wait
2 Thread 0x7ff... "worker-1" do_work () at worker.c:42
3 Thread 0x7ff... "worker-2" __libc_recv () at recv.c:27
The * marks the current thread. Switch with:
(gdb) thread 3
(gdb) bt
A backtrace from every thread — the single most useful command for multi-threaded freeze investigations:
(gdb) thread apply all bt
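Look for wait chains: thread A waiting on a lock held by B, B waiting on a lock held by C. If the chain loops back to A, that is a deadlock. If it ends at a thread blocked on something slow, such as C waiting on the network, the others are starved rather than deadlocked.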
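To have something concrete to run that command against, here is a minimal sketch of the classic two-lock cycle: one thread takes A then B, the other takes B then A. All names are hypothetical, and it assumes a platform with POSIX barriers (Linux with glibc, for example). The barriers force the bad interleaving every time, and pthread_mutex_trylock stands in for the blocking call so the demo reports the cycle instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L /* for pthread_barrier_t under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* All names here are hypothetical; this is a practice target, not library code. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t both_hold_first;   /* passed once each thread owns its first lock */
static pthread_barrier_t both_tried_second; /* passed once each thread has tried its second */

struct lock_order {
    pthread_mutex_t *first;
    pthread_mutex_t *second;
    int second_rc; /* result of the attempt on the second lock */
};

static void *take_two_locks(void *arg)
{
    struct lock_order *o = arg;

    pthread_mutex_lock(o->first);
    pthread_barrier_wait(&both_hold_first);

    /* A blocking pthread_mutex_lock here would hang both threads forever.
       trylock stands in for it so the demo can report the cycle and exit. */
    o->second_rc = pthread_mutex_trylock(o->second);

    /* Nobody releases anything until both attempts are finished. */
    pthread_barrier_wait(&both_tried_second);
    if (o->second_rc == 0)
        pthread_mutex_unlock(o->second);
    pthread_mutex_unlock(o->first);
    return NULL;
}

/* Returns how many of the two threads found their second lock held (EBUSY). */
int count_blocked_threads(void)
{
    struct lock_order ab = { &lock_a, &lock_b, 0 }; /* thread 1: A then B */
    struct lock_order ba = { &lock_b, &lock_a, 0 }; /* thread 2: B then A */
    pthread_t t1, t2;
    int rc;

    rc = pthread_barrier_init(&both_hold_first, NULL, 2);
    assert(rc == 0);
    rc = pthread_barrier_init(&both_tried_second, NULL, 2);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, take_two_locks, &ab);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, take_two_locks, &ba);
    assert(rc == 0);
    (void)rc;

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_barrier_destroy(&both_hold_first);
    pthread_barrier_destroy(&both_tried_second);

    return (ab.second_rc == EBUSY) + (ba.second_rc == EBUSY);
}
```

Call count_blocked_threads() from main to see both threads blocked on each other. Swapping the trylock for pthread_mutex_lock should turn this into a real hang, at which point thread apply all bt ought to show both threads parked in the lock call, each waiting on the mutex the other holds.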
Naming Threads
Unnamed threads are unreadable in info threads. Set names from the thread itself:
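#define _GNU_SOURCE  /* must come before any #include */
#include <pthread.h>
pthread_setname_np(pthread_self(), "net-rx");
On Linux the limit is 16 bytes including the terminating NUL, so 15 visible characters; longer names fail with ERANGE. Now info threads shows net-rx instead of app.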
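For context, a fuller sketch of that call inside a thread entry function, assuming Linux with glibc. The struct and function names are hypothetical. It reads the name back with pthread_getname_np so you can confirm what the kernel stored, and it records the return code so an over-long name is not silently ignored.

```c
#define _GNU_SOURCE /* pthread_setname_np and pthread_getname_np are GNU extensions */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

#define THREAD_NAME_MAX 16 /* Linux limit, including the terminating NUL */

struct named_thread {
    const char *wanted;           /* name to apply */
    char actual[THREAD_NAME_MAX]; /* name read back after the call */
    int set_rc;                   /* return code of pthread_setname_np */
};

static void *thread_main(void *arg)
{
    struct named_thread *nt = arg;

    assert(nt->wanted != NULL);
    /* Name the thread first, so it is identifiable from its earliest frames. */
    nt->set_rc = pthread_setname_np(pthread_self(), nt->wanted);
    if (pthread_getname_np(pthread_self(), nt->actual, sizeof nt->actual) != 0)
        nt->actual[0] = '\0';

    /* Real work would start here. */
    return NULL;
}

/* Starts one thread that names itself, then waits for it. Returns 0 on success. */
int run_named_thread(struct named_thread *nt)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, thread_main, nt);

    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}
```

Checking set_rc matters because a name that does not fit is rejected rather than truncated, and the thread keeps the name it inherited.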
Conditional Breakpoints by Thread
A breakpoint that fires on every thread is useless when only one is interesting. Pin it:
(gdb) break worker.c:42 thread 3
(gdb) break worker.c:42 thread 3 if pid == 1234
The second form combines thread selection with a data condition.
Non-Stop Mode
By default (all-stop mode), GDB stops every thread whenever any thread hits a breakpoint. That is safe but distorts timing: the bug may not happen while the other threads are paused.
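Turn on non-stop mode before the program starts:
(gdb) set pagination off
(gdb) set non-stop on
(gdb) run
Older guides also list set target-async on; recent GDB releases deprecate that setting and run asynchronously by default. Now only the thread that hit the breakpoint stops. Others keep running. Resume the stopped thread without affecting others: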
(gdb) thread 3
(gdb) continue&
This exposes races that vanish with all-stop mode — the so-called Heisenbug effect.
Watchpoints
A breakpoint stops on a line. A watchpoint stops on a value.
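(gdb) watch counter
(gdb) watch -l ptr->state
(gdb) rwatch counter
(gdb) awatch counter
watch stops when the value is written and changes. With -l (short for -location), GDB watches the address the expression currently refers to, so the watchpoint survives ptr being reassigned. rwatch stops on reads, and awatch stops on reads and writes.
For a corrupted variable mystery: watch corrupted_field. GDB stops as soon as any thread changes it and switches to that thread, so bt shows the perpetrator.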
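Hardware watchpoints are a scarce resource: x86-64, for example, has four debug registers, so at most four can be active at once. When GDB falls back to a software watchpoint it single-steps the program and checks the value after every instruction, which is hundreds of times slower. In a multi-threaded program a software watchpoint is also only reliable for changes made by the current thread, so keep to hardware watchpoints when hunting cross-thread corruption.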
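A small practice target for the corrupted_field scenario, as a sketch with hypothetical names: one thread writes a struct field through a pointer that the owning code has lost track of. The join makes the write well-defined, so the only mystery is who did it.

```c
#include <assert.h>
#include <pthread.h>

struct session {
    int id;
    int corrupted_field; /* name taken from the text above */
};

/* The "perpetrator": writes the field through a pointer nobody expects it to hold. */
static void *rogue_writer(void *arg)
{
    int *alias = arg;

    *alias = -1; /* a watchpoint on the field should report this store */
    return NULL;
}

/* Returns 0 on success and stores the field's final value in *final_value. */
int run_corruption_demo(int *final_value)
{
    struct session s = { .id = 7, .corrupted_field = 42 };
    pthread_t tid;
    int rc;

    assert(final_value != NULL);
    rc = pthread_create(&tid, NULL, rogue_writer, &s.corrupted_field);
    if (rc != 0)
        return rc;
    rc = pthread_join(tid, NULL);
    *final_value = s.corrupted_field;
    return rc;
}
```

To try it, break on run_corruption_demo, step past the initializer, set watch -l s.corrupted_field, and continue. GDB should stop in the thread running rogue_writer and report the old and new values.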
ThreadSanitizer First
GDB is for inspection, not race detection. For finding races, compile with TSan:
clang -fsanitize=thread -g -O1 ./app.c -o app
./app
TSan instruments memory accesses and prints the conflicting accesses, with stack traces of both threads. It catches races that may not yet manifest on your hardware. Use TSan to find them; use GDB to investigate them.
Most real-world race investigations are TSan output → GDB attach.
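As a sketch of the kind of bug TSan is built to report, here are two counters incremented from two threads: one a plain long, one a C11 atomic. The names and the iteration count are made up for illustration. Built with -fsanitize=thread, the plain increment should produce a data race report, and the atomic one should not.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define ITERATIONS 100000L

static long racy_counter;        /* plain long: concurrent ++ is a data race */
static atomic_long safe_counter; /* C11 atomic: increments are never lost */

static void *bump_racy(void *arg)
{
    (void)arg;
    for (long i = 0; i < ITERATIONS; i++)
        racy_counter++; /* unsynchronized read-modify-write */
    return NULL;
}

static void *bump_safe(void *arg)
{
    (void)arg;
    for (long i = 0; i < ITERATIONS; i++)
        atomic_fetch_add(&safe_counter, 1);
    return NULL;
}

/* Runs fn on two threads and waits for both. Returns 0 on success. */
static int run_on_two_threads(void *(*fn)(void *))
{
    pthread_t t1, t2;
    int rc = pthread_create(&t1, NULL, fn, NULL);

    if (rc != 0)
        return rc;
    rc = pthread_create(&t2, NULL, fn, NULL);
    if (rc == 0)
        pthread_join(t2, NULL);
    pthread_join(t1, NULL);
    return rc;
}

/* Total after both threads finish; may come out below 2 * ITERATIONS. */
long count_racy(void)
{
    int rc;

    racy_counter = 0;
    rc = run_on_two_threads(bump_racy);
    assert(rc == 0);
    (void)rc;
    return racy_counter;
}

/* Total after both threads finish; equals 2 * ITERATIONS. */
long count_safe(void)
{
    int rc;

    atomic_store(&safe_counter, 0);
    rc = run_on_two_threads(bump_safe);
    assert(rc == 0);
    (void)rc;
    return atomic_load(&safe_counter);
}
```

The racy version can still return the right total on a given run, especially with optimization enabled, which is exactly why a tool that checks the accesses rather than the result is the better first step.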
Reverse Debugging (when it's available)
With rr (record-replay), you record one execution and replay it deterministically as many times as you want. GDB inside rr lets you step backwards:
rr record ./app
rr replay
(gdb) reverse-continue
(gdb) reverse-next
(gdb) watch -l mystate
(gdb) reverse-continue
This is the killer feature for "how did this variable get this value?" With the watchpoint set, that last reverse-continue runs backward to the previous write. No more re-running and praying.
Attaching to a Live Process
Don't restart — attach:
gdb -p $(pgrep myapp)
This is essential for production debugging, but attaching stops every thread until you continue or detach, so keep the session short. With gcore, you can also snapshot a core dump from the live process and analyze it offline:
gcore $(pgrep myapp)
gdb ./myapp core.12345
The process is paused only while the dump is written and then resumes, so the service stays up.
Useful Pretty-Printing
GDB's default print of a pthread_mutex_t is a dump of internal fields that tells you little at a glance. Recent glibc releases ship Python pretty-printers for their pthread types, and you can write your own in Python for your data structures. A 30-line printer for your queue type pays for itself within an hour.
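(gdb) python
class QueuePrinter:
    def __init__(self, val): self.val = val
    def to_string(self):
        return "queue head=%d tail=%d size=%d" % (
            int(self.val['head']), int(self.val['tail']), int(self.val['size']))
def queue_lookup(val):
    # Assumes the type is declared as "struct queue"; adjust the tag to match yours.
    if val.type.strip_typedefs().tag == 'queue':
        return QueuePrinter(val)
gdb.pretty_printers.append(queue_lookup)
end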
Common Multi-thread Bug Patterns
Frozen process, no CPU usage: deadlock. Run thread apply all bt and look for the cycle.
Frozen process, 100% CPU: livelock or busy-spin. Same backtrace command — you'll see threads in a tight loop or pthread_cond_wait returning immediately.
Crash with corrupted state: race writing the state. TSan first, watchpoint second.
Bug disappears under GDB: timing-sensitive race. Switch to non-stop mode, or use rr for deterministic replay.
Different stack traces every run: often lock-free code with insufficient memory ordering. Reach for the memory-barriers article or the Relacy model checker.
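For comparison with the busy-spin pattern above, here is a sketch of a healthy condition-variable wait, using a hypothetical one-slot mailbox. The waiter re-checks its predicate in a loop while holding the mutex, and the producer changes the predicate under the same mutex before signalling. A thread written this way should appear in a backtrace parked inside pthread_cond_wait, not spinning through it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    bool has_item; /* the predicate the consumer waits on */
    int item;
    int received;  /* what the consumer took out */
};

static void *consumer(void *arg)
{
    struct mailbox *mb = arg;

    pthread_mutex_lock(&mb->lock);
    /* Loop, not if: pthread_cond_wait may return spuriously, and the
       predicate is the only reliable statement of what we are waiting for. */
    while (!mb->has_item)
        pthread_cond_wait(&mb->nonempty, &mb->lock);
    mb->received = mb->item;
    mb->has_item = false;
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Hands one value to a consumer thread. Returns 0 on success. */
int deliver_one(int value, int *received)
{
    struct mailbox mb = {
        .lock = PTHREAD_MUTEX_INITIALIZER,
        .nonempty = PTHREAD_COND_INITIALIZER,
    };
    pthread_t tid;
    int rc;

    assert(received != NULL);
    rc = pthread_create(&tid, NULL, consumer, &mb);
    if (rc != 0)
        return rc;

    pthread_mutex_lock(&mb.lock);
    mb.item = value;
    mb.has_item = true;                /* change the predicate under the lock */
    pthread_cond_signal(&mb.nonempty); /* then wake the waiter */
    pthread_mutex_unlock(&mb.lock);

    rc = pthread_join(tid, NULL);
    *received = mb.received;
    return rc;
}
```

Because the predicate lives under the mutex, the hand-off works whether the signal arrives before or after the consumer starts waiting.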
Takeaways
- thread apply all bt is the first command in any multi-threaded freeze.
- Name your threads. Untagged threads are useless under load.
- Use non-stop mode when freezing all threads distorts the bug.
- TSan finds races; GDB inspects them. Don't confuse the roles.
- rr is worth installing for the day you have a non-deterministic bug. That day always comes.