Format String Attacks Demystified

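Format string vulnerabilities arise when user-controlled input is used as the format argument of a printf-family function such as printf(), fprintf(), or syslog(). The function interprets any conversion specifiers embedded in that input, so instead of merely printing text, the attacker can read stack contents and, with %n, write to memory.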

We demonstrate reading the stack with %08x specifiers, then escalate to an arbitrary write using %n. That write-what-where primitive can turn a seemingly harmless logging call into full code execution.

The fix is trivial: always pass a string literal as the format and supply untrusted data as an argument. Write printf("%s", buf) instead of printf(buf). Yet this class of bug resurfaces in embedded firmware to this day.
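A minimal sketch of the safe pattern follows. The helper name format_untrusted is hypothetical and exists only for illustration; it formats into a caller-supplied buffer so the result is easy to inspect. The unsafe form appears only in a comment, because calling it with hostile input is undefined behavior.

```c
#include <stdio.h>

/*
 * Unsafe, shown for contrast only (do not compile this in):
 *     printf(buf);    // buf is parsed as a format string
 */

/*
 * Hypothetical helper: copy untrusted text into out.
 * The format is the literal "%s", so any '%' sequences inside buf
 * are treated as plain data and never interpreted.
 * Returns the number of characters that the full string needs,
 * as snprintf does.
 */
int format_untrusted(char *out, size_t out_size, const char *buf)
{
    return snprintf(out, out_size, "%s", buf);
}
```

Passing the input "%x %x %n" through this helper yields exactly those eight characters in out; no stack values are leaked and nothing is written through %n.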

Why does this still happen? Because logging code is often treated as harmless, copied fast, and reviewed late. In small C projects, developers optimize for speed of implementation and forget that formatting functions are tiny parsers with side effects.

Exploitation ladder

Typical progression in a lab binary:

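Leak stack values with %x and locate attacker-controlled bytes in the output.
Calibrate the argument offset until the same position reliably lands on those bytes.
Use field widths (for example %64x) to pad the output until the running character count equals the value to write.
Trigger %n (or %hn for a 16-bit write) to store that count at the target address.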

At that point, you can often redirect control flow indirectly by corrupting function pointers, GOT entries (where the GOT is still writable, that is, without full RELRO), or security-relevant flags.
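The last two steps of the ladder rest on one documented behavior: %n stores the number of characters produced so far, and a field width lets the caller choose that number. The sketch below shows the mechanism in a benign, well-defined setting, with a literal format and a pointer to a local variable. The function name count_after_padding and the 1000-character limit are assumptions made for this example. Some C libraries restrict or disable %n, so treat the behavior as typical of glibc-style implementations rather than universal.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns the value %n stores after printing
 * `prefix` followed by `pad` characters of padding.
 * Returns -1 if the request does not fit the scratch buffer.
 */
int count_after_padding(const char *prefix, int pad)
{
    char sink[1024];
    int written = -1;

    /* Keep the output inside sink so nothing is truncated. */
    if (pad < 0 || pad > 1000 || strlen(prefix) > 16)
        return -1;

    /*
     * "%*s" pads an empty string to `pad` characters.
     * "%n" then stores the running character count in `written`.
     * The format is a literal; only the width and the data vary.
     */
    snprintf(sink, sizeof sink, "%s%*s%n", prefix, pad, "", &written);
    return written;
}
```

With an empty prefix and a pad of 300 the stored value is 300; with the prefix "AAAA" and a pad of 60 it is 64. In an attack, the prefix corresponds to bytes already printed before the padding, which is why the offset has to be calibrated first.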

Defensive pattern

Treat every formatting call as a sink:

enforce literal format strings in coding guidelines
compile with warnings that detect non-literal format usage (with GCC and Clang, -Wformat -Wformat-security, or the stricter -Wformat-nonliteral)
isolate logging wrappers so raw printf calls are rare
review embedded diagnostics paths as carefully as network parsers
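One way to combine the warning and wrapper items above is sketched here, assuming GCC or Clang. The names PRINTF_LIKE, log_fmt, and log_text are hypothetical. The format attribute asks the compiler to check callers of log_fmt the way it checks printf, and log_text gives untrusted strings an entry point that has no format parameter at all.

```c
#include <stdarg.h>
#include <stdio.h>

/* The format attribute is a GCC/Clang extension; other compilers skip it. */
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_idx, arg_idx) \
    __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, arg_idx)
#endif

/* Hypothetical wrapper: parameter 2 is the format, varargs start at 3. */
int log_fmt(FILE *out, const char *fmt, ...) PRINTF_LIKE(2, 3);

int log_fmt(FILE *out, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vfprintf(out, fmt, ap);   /* forwards the checked format */
    va_end(ap);
    return n;
}

/*
 * Hypothetical entry point for untrusted text.
 * The caller cannot supply a format, so '%' in text is plain data.
 */
int log_text(FILE *out, const char *text)
{
    return log_fmt(out, "%s\n", text);
}
```

With this split, application code calls log_text for anything that originates outside the program and reserves log_fmt for literal formats, where a call such as log_fmt(out, user_input) becomes a candidate for a -Wformat-security diagnostic.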