brk, sbrk, mmap: what malloc hides

You call malloc(256). You get a pointer back. You use it. You call free(). Done.

Except no. Between your call and the physical memory sit a few system calls (brk, mmap, munmap), an allocator of roughly 6,000 lines of code (glibc), and the kernel’s virtual-memory subsystem. Understanding this stack explains why some programs consume three times more memory than you’d expect, why free doesn’t always return memory to the system, and why a custom allocator can be an order of magnitude faster.

The heap and brk

Historically, the heap of a Unix process is a contiguous region that starts right after the BSS segment (uninitialized data) and grows toward higher addresses. The top of that region is called the “program break.”

The brk(address) syscall moves that top. sbrk(increment) moves it by a relative number of bytes and returns the previous position; on Linux it is a C library wrapper around brk rather than a separate syscall. It’s the oldest memory-allocation mechanism on Unix.

void *block = sbrk(4096);  // advance the program break by 4096 bytes; block is the old break, or (void *)-1 on failure

The kernel doesn’t do anything spectacular: it updates an internal variable (mm->brk), creates or extends a virtual-memory area (VMA), and returns. No physical page is allocated at that stage. The kernel allocates lazily: a physical page is only assigned on first access, via a page fault.

brk is fast (no search, no complex structure) but rigid: the heap can only grow or shrink from the top. You cannot hand a block in the middle back to the system.
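
To watch lazy allocation and the top-only rule on a live process, here is a minimal sketch, assuming Linux with glibc. probe_break and struct break_probe are names invented for this example; sbrk, getrusage, and sysconf are the real calls. It grows the break, touches one byte per new page while counting minor page faults, then shrinks the break again.

```c
#define _DEFAULT_SOURCE            /* exposes sbrk() in glibc's <unistd.h> */
#include <stdint.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
struct break_probe {
    long break_moved;   /* how far the program break advanced, in bytes */
    long minor_faults;  /* page faults taken while first touching the new bytes */
};

/* Read the process's minor page-fault counter. */
static long minor_faults_so_far(void)
{
    struct rusage usage;
    getrusage(RUSAGE_SELF, &usage);
    return usage.ru_minflt;
}

/* Grow the break by `bytes` (must be > 0), touch one byte per page,
   then shrink the break back. Returns {-1, -1} on bad input or sbrk failure. */
struct break_probe probe_break(intptr_t bytes)
{
    struct break_probe result = { -1, -1 };
    long page = sysconf(_SC_PAGESIZE);
    if (bytes <= 0)
        return result;

    char *old_top = sbrk(bytes);             /* returns the previous break */
    if (old_top == (char *)-1)
        return result;
    char *new_top = sbrk(0);                 /* sbrk(0) only reads the break */

    long before = minor_faults_so_far();
    for (intptr_t off = 0; off < bytes; off += page)
        ((volatile char *)old_top)[off] = 1; /* first write faults the page in */
    long after = minor_faults_so_far();

    sbrk(-bytes);                            /* give the region back, from the top */
    result.break_moved = new_top - old_top;
    result.minor_faults = after - before;
    return result;
}
```

Calling probe_break(64 * 4096) should report a break movement equal to the requested size. The fault count depends on the kernel and its configuration, so read it as an indication, not an exact page count. Real code should not mix raw sbrk calls with malloc, since glibc moves the same break.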

mmap: the alternative

For larger allocations, malloc uses mmap instead of brk. The call mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) asks the kernel to create a new virtual-memory area, decoupled from the heap.

The upside: each mmap allocation is independent. When you free it with munmap, the memory is returned to the system immediately. No heap fragmentation, no dependency on the program break position.

The downside: mmap costs more than brk. Each call creates a new VMA in the kernel, updates the kernel’s tree of memory regions (a red-black tree historically, a maple tree since Linux 6.1), and potentially invalidates TLB entries. Across thousands of small allocations, that fixed cost is prohibitive.
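
The mmap call quoted above can be wrapped in a few lines. This is a sketch assuming Linux; map_region and unmap_region are hypothetical helper names, while mmap, munmap, and the flags are the real API.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: map `size` bytes of private, zero-filled memory.
   Returns NULL on failure. */
void *map_region(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return the region to the kernel. Returns 0 on success, -1 on error. */
int unmap_region(void *p, size_t size)
{
    return munmap(p, size);
}
```

Note that munmap needs the size again: the kernel tracks the mapping, but the caller has to remember how large it was. That is one of the bookkeeping jobs malloc does for you.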

The M_MMAP_THRESHOLD switch

glibc chooses between the two mechanisms with a threshold: M_MMAP_THRESHOLD. The default is 128 KB (131,072 bytes), and glibc raises it dynamically as large mmap-backed blocks are freed.

Below the threshold: malloc uses the heap (managed via brk/sbrk), with its internal structures (bins, arenas, tcache).
Above the threshold: malloc calls mmap directly. The block is isolated; free will call munmap.

You can force the threshold with mallopt(M_MMAP_THRESHOLD, value), which also switches off the dynamic adjustment, but in practice glibc self-tunes well. What matters is knowing the threshold exists, because it explains some surprising behavior.
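
One way to see the switch in action is to check where a block lands. The sketch below assumes Linux with glibc and the usual address-space layout, where the brk heap sits between the end of the BSS and the program break. in_brk_heap, lands_in_brk_heap, and set_mmap_threshold are invented names, and the address test is a heuristic, not an API that glibc offers.

```c
#define _DEFAULT_SOURCE
#include <malloc.h>     /* mallopt, M_MMAP_THRESHOLD (glibc-specific) */
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

extern char end;        /* linker symbol: first address past the BSS, see end(3) */

/* Heuristic: does `p` lie between the end of the BSS and the program break? */
int in_brk_heap(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return addr >= (uintptr_t)&end && addr < (uintptr_t)sbrk(0);
}

/* Allocate `size` bytes, report where the block landed, then free it.
   Returns 1 for the brk heap, 0 for elsewhere (mmap), -1 if malloc failed. */
int lands_in_brk_heap(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return -1;
    int inside = in_brk_heap(p);
    free(p);
    return inside;
}

/* Pin the threshold; returns 1 on success, 0 on error (mallopt's convention). */
int set_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes);
}
```

With default settings, a 1 KB request should land in the brk heap and a 256 KB request outside it. After set_mmap_threshold(1024 * 1024), the same 256 KB request should come from the brk heap instead.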

Why free doesn’t return memory

When you free a block allocated via the heap (brk), glibc marks it free in its internal structures. But it does not shrink the heap immediately. The block stays reserved, in case a future malloc needs a similarly sized one.

The heap only shrinks when the free space at its top exceeds a certain threshold (configurable via M_TRIM_THRESHOLD, default 128 KB). And even then, brk can only shrink from the top: if a small live block still sits at the top of the heap, everything below stays reserved, even if 99% of it is free memory.

That’s the fundamental mechanism of an “apparent leak”: your program has freed 90% of its memory, but top shows unchanged consumption. The memory is free for malloc but not returned to the system.

Blocks allocated via mmap, on the other hand, are returned immediately by munmap. That’s why large allocations don’t cause this problem.
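
The pinned-top effect can be reproduced in a few lines. This is a sketch assuming a single-threaded process on Linux with glibc's default tunables; trace_heap_top, struct heap_trace, and the sizes are choices made for this example. It allocates 4 MB in blocks small enough to stay on the brk heap, adds one last block that ends up at the top, and records the break position after each step.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <unistd.h>

enum { BLOCKS = 64, BLOCK_SIZE = 64 * 1024, PIN_SIZE = 2048 };

/* Break position relative to the starting break, in bytes, at three moments. */
struct heap_trace {
    long after_alloc;       /* all big blocks and the pin allocated */
    long after_bulk_free;   /* every big block freed, pin still live */
    long after_pin_free;    /* pin freed as well */
};

struct heap_trace trace_heap_top(void)
{
    /* volatile stops the compiler from optimizing the malloc/free pairs away */
    static void *volatile slots[BLOCKS + 1];
    struct heap_trace t;
    char *start = sbrk(0);

    for (int i = 0; i < BLOCKS; i++)
        slots[i] = malloc(BLOCK_SIZE);   /* below the mmap threshold: brk heap */
    slots[BLOCKS] = malloc(PIN_SIZE);    /* allocated last, so it sits at the top */
    t.after_alloc = (char *)sbrk(0) - start;

    for (int i = 0; i < BLOCKS; i++)
        free(slots[i]);                  /* 4 MB free, but all of it below the pin */
    t.after_bulk_free = (char *)sbrk(0) - start;

    free(slots[BLOCKS]);                 /* the free space now reaches the top */
    t.after_pin_free = (char *)sbrk(0) - start;
    return t;
}
```

On a typical glibc setup, after_alloc and after_bulk_free should both stay a little above 4 MB, and after_pin_free should fall back to a small residue. The 2 KB pin is what keeps 4 MB reserved.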

glibc’s internal structures

Between brk/mmap and your pointer, glibc maintains a complex machinery:

Arenas: multiple heap regions to reduce contention between threads. The main heap uses brk; secondary arenas use mmap.
Bins: free-block lists classified by size. Fastbins (small blocks, no coalescing) serve the most common allocations without a heavy lock.
tcache: a per-thread cache (since glibc 2.26). Each thread has its own free lists, avoiding the arena lock for common cases.
Metadata: every block carries a header (8 or 16 bytes depending on the architecture) with its size and flags, and block sizes are rounded up for alignment. That’s the overhead you pay on each malloc, even for a 1-byte block.
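
The rounding is easy to observe with malloc_usable_size, a glibc extension. The helper name usable_bytes is invented for this sketch; it reports how many bytes a block really offers for a given request.

```c
#include <malloc.h>   /* malloc_usable_size (glibc extension) */
#include <stdlib.h>

/* Bytes actually usable in the block that malloc(request) returns,
   or 0 if the allocation fails. */
size_t usable_bytes(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

On x86-64 glibc, usable_bytes(1) typically reports 24 and usable_bytes(64) reports 72: a 1-byte request occupies a 32-byte chunk once the size word is counted. Other architectures and allocators will give different numbers.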

What this actually changes

Knowing that malloc uses brk for small blocks and mmap for large ones gives you three levers:

1. Batch small allocations. If you allocate 10,000 blocks of 64 bytes, that’s 10,000 headers of 16 bytes (on a 64-bit system) = 160 KB of pure metadata. A bump allocator pre-allocates one block and advances a pointer, with no per-block metadata.
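
A bump allocator fits in about twenty lines. The version below is a minimal sketch, not a production design: struct bump, bump_init, and bump_alloc are hypothetical names, and it assumes the backing buffer is itself aligned for any type (for example, memory obtained from malloc).

```c
#include <stddef.h>

/* Hypothetical bump allocator over a caller-supplied buffer. */
struct bump {
    unsigned char *base;   /* start of the backing buffer */
    size_t capacity;       /* total bytes available */
    size_t used;           /* bytes handed out so far */
};

void bump_init(struct bump *b, void *buffer, size_t capacity)
{
    b->base = buffer;
    b->capacity = capacity;
    b->used = 0;
}

/* Hand out `size` bytes at an offset aligned for any type, or NULL when full.
   There is no per-block header and no way to free a single block. */
void *bump_alloc(struct bump *b, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (b->used + align - 1) & ~(align - 1);   /* round up */
    if (start > b->capacity || size > b->capacity - start)
        return NULL;
    b->used = start + size;
    return b->base + start;
}
```

Two consecutive bump_alloc(&b, 64) calls return addresses exactly 64 bytes apart. The price is that individual blocks cannot be freed; the whole buffer is released or reused at once.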

2. Understand memory consumption. If top shows 500 MB and your code only uses 50 MB, it’s probably not a leak. It’s the brk heap not shrinking. Use malloc_stats() or malloc_info() to see the real breakdown.
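
Both reporting functions are glibc extensions declared in malloc.h. A small wrapper, here called dump_heap_report (a name made up for this sketch), is enough to get the breakdown at any point in a program.

```c
#include <malloc.h>   /* malloc_stats, malloc_info (glibc extensions) */
#include <stdio.h>

/* Write glibc's allocator report for the current process.
   Returns 0 on success, nonzero if malloc_info() fails. */
int dump_heap_report(FILE *out)
{
    malloc_stats();               /* text summary, always written to stderr */
    return malloc_info(0, out);   /* XML breakdown per arena; options must be 0 */
}
```

In the malloc_stats output, compare "system bytes" (what the allocator holds from the kernel) with "in use bytes" (what your code currently holds). A large gap between the two points at retained heap, not at a leak.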

3. Pick the right release strategy. If you know all your objects will be freed at the same time (request end, phase end), an arena allocator is faster and leaves no residual fragmentation.
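
Here is one possible shape for such an arena, as a sketch under assumptions: struct arena, arena_alloc, arena_release, and the 64 KB chunk size are all choices made for this example. Objects are carved out of large malloc'd chunks, and a single call releases everything.

```c
#include <stddef.h>
#include <stdlib.h>

enum { ARENA_CHUNK = 64 * 1024 };   /* assumed default chunk size */

/* Hypothetical arena: a linked list of malloc'd chunks released together. */
struct arena_chunk {
    struct arena_chunk *next;
    size_t used, capacity;
    max_align_t data[];             /* flexible array keeps the payload aligned */
};

struct arena { struct arena_chunk *head; };

void *arena_alloc(struct arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size > (size_t)-1 / 2)
        return NULL;                               /* avoid overflow below */
    size = (size + align - 1) & ~(align - 1);      /* keep offsets aligned */

    struct arena_chunk *c = a->head;
    if (c == NULL || size > c->capacity - c->used) {
        size_t capacity = size > ARENA_CHUNK ? size : ARENA_CHUNK;
        c = malloc(sizeof *c + capacity);          /* one malloc per chunk */
        if (c == NULL)
            return NULL;
        c->next = a->head;
        c->used = 0;
        c->capacity = capacity;
        a->head = c;
    }
    void *p = (unsigned char *)c->data + c->used;
    c->used += size;
    return p;
}

/* Free every object at once, e.g. at the end of a request. */
void arena_release(struct arena *a)
{
    while (a->head != NULL) {
        struct arena_chunk *next = a->head->next;
        free(a->head);
        a->head = next;
    }
}
```

Start with struct arena a = { NULL }, allocate freely during the request, then call arena_release(&a) once. If the chunk size is raised above the mmap threshold, glibc backs each chunk with mmap, so the release also hands the memory straight back to the system.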

These mechanisms are observable in practice with strace, malloc_stats(), and by reading /proc/self/maps.
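
As a starting point for the /proc/self/maps route, the sketch below looks for the line the kernel labels [heap], which is the brk region. It assumes Linux with /proc mounted; find_heap_range is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Find the "[heap]" line in /proc/self/maps (Linux only).
   Returns 1 and fills *start and *end if found, 0 otherwise. */
int find_heap_range(unsigned long *start, unsigned long *end)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return 0;

    char line[512];
    int found = 0;
    while (!found && fgets(line, sizeof line, maps) != NULL) {
        /* each line starts with "start-end" in hexadecimal */
        if (strstr(line, "[heap]") != NULL &&
            sscanf(line, "%lx-%lx", start, end) == 2)
            found = 1;
    }
    fclose(maps);
    return found;
}
```

Calling it before and after a burst of allocations shows the brk heap growing, while large mmap-backed blocks appear as separate anonymous lines in the same file. For the syscall view, strace -e trace=brk,mmap,munmap on the program shows each call as malloc makes it.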

Natural follow-up: The 4 allocators every C developer should know — bump, pool, free-list, arena: four concrete ways to replace malloc when its general-purpose guarantees become a cost.