How Redis actually deletes expired keys

Spoiler: not at TTL=0. Not even close. Three mechanisms do the work: lazy expiration (checked on access, deleted then), active expiration (a background sampler, every ~100ms), and maxmemory eviction (the last resort, when RAM is full). Two of them run all the time; one only when you're already in trouble. This is why a key at TTL=0 can sit in memory for minutes, and what that means for your cache. A senior-engineer walkthrough of how the world's most popular cache cleans up after itself.

Here's a thing that surprises people the first time their Redis bill goes up: setting a TTL on a key does not mean the key disappears at TTL=0. Redis is more pragmatic than that. The key stays in memory until either (a) someone accesses it and gets told "yeah, this is gone" while Redis silently deletes it, or (b) a background sampler happens to roll the dice on it. This blog walks through the two strategies Redis uses, why the hybrid exists, the math behind the 25% threshold, what happens on replicas and in cluster mode, and the handful of tuning knobs that actually matter.

Why this is even a question

The naïve mental model is "Redis stores a TTL with each key, runs a timer, deletes the key on expiry." That sounds clean. It's also a terrible idea at any meaningful scale.

Imagine 50 million keys, each with its own TTL. The timer-per-key model means 50 million scheduled events fighting for kernel timers, or one giant priority queue that you have to walk on every tick. CPU goes up, latency goes up, the event loop blocks. Redis is single-threaded for command processing — the one thing you absolutely cannot do is spend the whole tick scanning expiry metadata.

So the design choice was: don't try to be precise about when a key dies. Be precise about what reads return, and clean up in the background at a rate you can control. That's the whole insight. Everything else is mechanism.

The contract Redis actually gives you: "If you ask for an expired key, you'll never see its value." It does not promise "the key is removed from memory at TTL=0." Two different guarantees, one of them much cheaper to implement.

Strategy 1 — Lazy (passive) expiration

Every command that touches a key does an expiry check first. GET foo, HGET user:42 name, EXISTS session:abc — all of these go through the same lookup path. The lookup checks: does this key exist? If yes, is its TTL expired? If expired, delete it now, return as if it never existed. Otherwise, serve the value.

Diagram: lazy expiration, the key dies the moment someone touches it. A GET foo flows through: look up foo in the dict (the TTL is stored separately), compare now() against the stored expire-at (a cheap integer compare), then either return the value or DEL the key and return nil. Cost per lookup: one extra integer compare. That's it. The problem: if a key is never accessed again, lazy expiration never fires, so dead keys can sit in RAM forever, paying full price for memory you can't use. This is why "set TTL and forget" is a memory leak waiting to happen.

Lazy is essentially free — the expiry check is just an integer comparison against the stored expire-at timestamp. The cost only kicks in if a key actually is expired, in which case Redis deletes it inline (or asynchronously via UNLINK-style lazy free, depending on the operation type and config).

The catch is the obvious one: if a key never gets accessed again, lazy expiration never runs on it. One million session keys with a 30-day TTL, and then half the users churn? Half a million dead keys, sitting in RAM, taking up bytes that Redis thinks are still "valid usage" — until something touches them. Without a second mechanism, this is a slow-burning memory leak.
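The lookup path described above can be sketched as a toy Python model. This is not Redis's C implementation, just a minimal illustration of the idea: a main dict for values, a separate dict of expire-at timestamps, and an expiry check that only fires when a key is touched. The `LazyCache` class and its method names are invented for this sketch.

```python
import time

class LazyCache:
    """Toy model of lazy expiration: expiry is only checked when a key is touched."""

    def __init__(self):
        self._data = {}     # key -> value (main dict)
        self._expires = {}  # key -> absolute expire-at timestamp (separate dict, like Redis)

    def set(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is not None:
            self._expires[key] = time.monotonic() + ttl
        else:
            self._expires.pop(key, None)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        expire_at = self._expires.get(key)
        if expire_at is not None and now >= expire_at:
            # Lazy expiration: one compare, delete on touch, answer "not found".
            del self._data[key]
            del self._expires[key]
            return None
        return self._data.get(key)

cache = LazyCache()
cache.set("session:abc", "alice", ttl=0.01)
time.sleep(0.05)
print(cache.get("session:abc"))       # None: deleted the moment it was touched
print("session:abc" in cache._data)   # False, but only *after* the access above
```

Until that `get` call, the dead key occupies memory exactly as the article describes.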

Strategy 2 — Active (proactive) expiration

This is the part that surprises people. Redis runs a background task — at 10 Hz by default, configurable via hz — that proactively scans for expired keys. It's not a full scan. That would be insane. It's a random sampler with a feedback loop:

Diagram: active expiration, random sample → delete → repeat if too many were dead. (1) Sample 20 keys randomly from the "keys with TTL" set. (2) Delete the expired ones, counting how many of the 20 were dead. (3) If more than 25% were expired, loop back to (1) within the same tick; otherwise stop and wait for the next tick. Why 20 keys and 25%? Probability and bounded work. Bounded CPU per cycle: each iteration touches ~20 keys, and the cycle has a hard time budget (default ~25% of one tick), so latency stays predictable. Probabilistic guarantee: in steady state, fewer than 25% of "keys with TTL" are dead but still in memory. Higher hz means a tighter bound and more CPU, and Redis 6+ adds the active-expire-effort knob (1–10) to push the threshold tighter at the cost of CPU.

The algorithm in plain words:

every 1000/hz milliseconds (default 100ms):
  start_time = now()
  loop:
    sample 20 random keys from the "keys with TTL" set
    expired_count = 0
    for each sampled key:
      if expired:
        delete it
        expired_count += 1

    if expired_count / 20 <= 25%:
      break              # mostly clean, stop for now
    if elapsed(start_time) > cycle_time_budget:
      break              # ran out of time, yield to the event loop
    # else: keep sampling — there are probably more dead keys

Why these specific numbers? The 25% threshold is the steady-state guarantee. If, statistically, the random sample shows fewer than 25% expired keys, Redis concludes "the proportion of dead keys with TTL in the keyspace is currently below ~25%, my work here is done for this tick." If it's above 25%, there's enough garbage to justify another sample-and-delete iteration. The feedback loop self-tunes to the workload — quiet keyspaces use almost no CPU; bursty TTL workloads automatically run more iterations to catch up.

"Keys with TTL" matters here. Redis maintains a separate dictionary of just the keys that have expirations set. Active expiration samples from that set, not from the entire keyspace. So a database with 100 million keys but only 1 million TTL'd keys runs the sampler against the 1 million — much cheaper.

Why both? The hybrid is the whole point

Either strategy alone fails:

  • Lazy alone: if a key is never accessed after expiry, it sits in RAM until the heat death of the universe. For workloads where many keys are written-then-forgotten (sessions, idempotency tokens, rate-limit counters, one-time tokens), this is a memory leak.
  • Active alone: would have to be aggressive enough to keep RAM clean, which means scanning a lot. CPU bomb on big keyspaces, especially when most keys aren't even close to expired.

The hybrid solves both:

  • Lazy handles the hot keys — they get touched anyway, free expiry checks come along for the ride.
  • Active handles the cold dead keys — the ones nobody is asking for. The probabilistic sampler keeps the dead-key fraction bounded without ever needing a full scan.
Diagram: the hybrid, lazy covers the hot path, active covers the cold tail. Hot keys (frequently accessed) are handled by lazy expiration on the very next GET / HGET / etc.: cost is one integer compare per lookup, latency imperceptible. Cold keys (never accessed again) are eventually caught by the active sampler, probabilistically, over many ticks: cost bounded by hz and the cycle budget, memory overshoot bounded at ~25%. Either path on its own breaks. Together they cover the whole distribution.

The hz parameter — what it actually controls

The hz config option controls how many times per second Redis runs its background tasks. Default is 10. Range is roughly 1–500. It's not just expiration — hz also drives client timeout checks, cluster bus tasks, and a few other periodic things — but expiration is the most visible one.

hz value | Effect | When to use
1–5 | Less CPU spent on background tasks. More dead keys lingering. Slower client timeout detection. | Tiny instances, very cold keyspaces, no TTLs.
10 (default) | Sweet spot. Active expiration runs every 100ms; the ~25% dead-key bound holds in steady state. | Default for almost everyone. Don't touch unless you have a reason.
50–100 | Active sampler runs more often. Tighter dead-key bound. More CPU spent in background tasks. | TTL-heavy workloads (sessions, rate limits) where memory bloat from cold keys is hurting you.
100+ | Diminishing returns. Background CPU competes with command execution. Latency p99 can creep up. | Rare. Measure before cranking it.

Redis 6+ added active-expire-effort as a separate dial (1 to 10) that scales the active-expire cycle's aggressiveness independently of the global hz. If you're trying to be tighter on dead-key memory but don't want to make every other periodic task run faster, this is the better lever.
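A back-of-envelope way to think about hz: each tick runs some number of sampler iterations, each touching ~20 keys. The sketch below computes a best-case reclamation rate; the iterations-per-cycle value is illustrative (the real cap comes from the time budget and active-expire-effort, not a fixed count), and `max_expirations_per_second` is a made-up helper name.

```python
def max_expirations_per_second(hz, iterations_per_cycle, sample_size=20):
    """Upper bound on keys the active sampler can reclaim per second,
    assuming the best case where every sampled key is expired."""
    return hz * iterations_per_cycle * sample_size

# Illustrative iteration cap of 16 per cycle -- an assumption, not a Redis constant.
for hz in (10, 50, 100):
    print(hz, max_expirations_per_second(hz, iterations_per_cycle=16))
# 10 -> 3200, 50 -> 16000, 100 -> 32000 keys/second, best case
```

The takeaway: if your write rate of short-TTL keys exceeds numbers like these, dead keys accumulate no matter what, and hz alone won't save you.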

DEL vs UNLINK — what "delete" actually means under the hood

Once Redis has decided a key is expired, the next question is how to free it. For a SET foo bar with a 5-byte value, deletion is a memcpy and a free — instantaneous. For a HSET user:1 ... with a million fields, freeing the underlying hash table is potentially milliseconds of CPU. Doing that synchronously on the event loop blocks every other client.

This is what UNLINK (Redis 4.0+) is for. UNLINK removes the key from the keyspace dictionary immediately (so reads no longer see it), but the actual memory free is queued to a background thread. The event loop unblocks instantly; the heavy reclamation runs off-thread.

Diagram: DEL vs UNLINK, same observable behavior, very different blocking. DEL bigkey (synchronous) unlinks from the dict and frees the memory immediately, on the event loop thread, blocking for O(N) where N is the key size. Million-field hash? Hello, latency spike. UNLINK bigkey (async free) unlinks from the dict now (cheap) and queues the free to a background thread; the event loop unblocks immediately. Same correctness, no latency spike.

For active expiration of big keys, Redis 4.0+ gates this behind lazyfree-lazy-expire. Note that it still defaults to no in stock redis.conf, so turn it on explicitly; with the flag set, expirations of big collections use UNLINK-style lazy free automatically. Without it, a single huge expired hash can pause your Redis for tens of milliseconds. With it, you don't notice.

If you're on a 3.x or early 4.x Redis and you have collection keys with large element counts under TTL — upgrade or set the lazy-free flags. A single active-expire cycle hitting a multi-million-element set can cause a visible event-loop pause. This used to be the most common "Redis is slow tonight" incident.
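The unlink-now, free-later pattern is easy to see in miniature. This is a toy Python analogue of UNLINK, not Redis's lazyfree machinery: popping the key from the dict is the cheap "unlink", and a background thread does the expensive reclamation. All names here are invented for the sketch.

```python
import queue
import threading

free_queue = queue.Queue()

def lazyfree_worker():
    # Background thread: actually releases the (potentially huge) value.
    while True:
        value = free_queue.get()
        value.clear()          # stand-in for the O(N) memory reclamation
        free_queue.task_done()

threading.Thread(target=lazyfree_worker, daemon=True).start()

keyspace = {"bigkey": {f"field{i}": i for i in range(100_000)}}

def unlink(key):
    """UNLINK-style delete: unlink from the dict now, free off-thread."""
    value = keyspace.pop(key)    # cheap: readers stop seeing the key immediately
    free_queue.put(value)        # expensive free happens on the worker thread

unlink("bigkey")
print("bigkey" in keyspace)  # False immediately, before the free finishes
free_queue.join()            # wait for the background reclamation (demo only)
```

The caller observes the key gone instantly; only the memory release is deferred, which is exactly the trade DEL vs UNLINK makes.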

Replicas — the part that breaks people's mental model

This is the one that usually shows up in postmortems. Replicas do not run their own active expiration on the user-facing keyspace (in the default configuration). The reasoning: if both the primary and replica independently decide to expire keys, they'll diverge — different sample order, different deletion order, the replica's view drifts from the primary's. So Redis's design is:

  • The primary runs lazy + active expiration as described.
  • When the primary deletes an expired key, it propagates a synthetic DEL to all replicas via the replication stream.
  • The replica applies the DEL just like any other write — that's when the key actually disappears from the replica's memory.
Diagram: expiration on a primary-replica pair. The primary runs lazy + active expiration and is the single source of truth: it decides "foo expired", DELs foo locally, and appends "DEL foo" to the replication stream. The replica does not run active expiry on the user keyspace; it applies the DEL when it arrives, and until then foo is still in its RAM. Could it briefly serve "expired" data on read? Modern Redis filters this on read: replicas check expiry on lookup, so clients don't see stale values even while the key is still allocated.

Two practical consequences:

  • The replica's used_memory can be higher than the primary's for short windows. The primary has already DEL'd the key locally; the replica is still waiting for the replication packet (or processing a backlog). Don't panic if the numbers don't match exactly.
  • Reads on a replica may see "expired-looking" keys if you're reading raw — but modern Redis applies an expiry check on lookup even on replicas, so client-visible reads won't return expired values. The memory footprint can lag, the user-visible answer will not.
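Both consequences fall out of one small model: a replica that filters expired values on read but only frees memory when the primary's DEL lands. The `Replica` class below is a toy illustration of that behavior, with invented method names.

```python
class Replica:
    """Toy replica: never expires keys itself; frees memory only when the
    primary's replicated DEL arrives, but filters expired values on read."""

    def __init__(self):
        self.data = {}
        self.expires = {}

    def get(self, key, now):
        expire_at = self.expires.get(key)
        if expire_at is not None and now >= expire_at:
            return None            # logically expired: hide it from clients...
        return self.data.get(key)  # ...but keep it allocated until DEL arrives

    def apply_repl_del(self, key):
        # The replication stream delivered "DEL key": now the memory is freed.
        self.data.pop(key, None)
        self.expires.pop(key, None)

r = Replica()
r.data["foo"] = "bar"
r.expires["foo"] = 100            # expire-at t=100 (arbitrary clock units)

print(r.get("foo", now=150))      # None: the read is filtered
print("foo" in r.data)            # True: still occupying RAM on the replica
r.apply_repl_del("foo")
print("foo" in r.data)            # False: freed when the primary's DEL lands
```

This is why replica used_memory can lag the primary while client-visible reads never return stale values.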

Cluster mode — same mechanism, sharded

In Redis Cluster, each node owns a subset of hash slots. Active expiration runs per node, on the keys that node is the primary for. Replicas of those slots follow the same DEL-propagation rule above. There's no global expiration coordinator — that would be a scalability disaster. Each node sweeps its own slots, period.

Practical implication: your "dead key" memory pressure is per-node, not cluster-wide. A skewed workload that puts most TTL'd keys on one slot will mostly tax that one node's expiration sampler.
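To see how keys land on nodes in the first place: the Redis Cluster spec assigns each key to one of 16384 slots via CRC16 (the XMODEM variant) mod 16384, with hash tags letting you pin related keys to one slot. A minimal sketch, following the published spec:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), the variant the Redis Cluster spec uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    """Map a key to its cluster slot, honoring {hash tag} rules:
    if the key contains a non-empty {...}, only that part is hashed."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384

print(hex(crc16_xmodem(b"123456789")))  # 0x31c3, the spec's reference value
# Same hash tag -> same slot -> same node's expiration sampler:
print(hash_slot(b"{user:1000}.following") == hash_slot(b"{user:1000}.followers"))  # True
```

This is also why skew happens: every key sharing a hash tag lands on one node, and that node's sampler sweeps all of them.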

Where TTL stops being enough — maxmemory + eviction

Here's the subtle thing senior engineers get bitten by: TTL is not a memory budget. It's a hint about when a value becomes invalid. If your write rate exceeds your expiration rate (lazy + active combined), memory grows. When memory hits maxmemory, TTL expiration keeps running, but it's no longer what keeps you under the cap: eviction under your configured policy takes over as the cleanup mechanism.

maxmemory-policy | What it does | When to use
noeviction | Reject writes with an OOM error. No keys deleted. | Source of truth, never want silent data loss.
allkeys-lru | Evict least-recently-used across all keys. | Pure cache. TTLs are bonus, not load-bearing.
allkeys-lfu | Evict least-frequently-used across all keys. | Cache where access frequency matters more than recency.
volatile-lru | Evict LRU only among keys with a TTL set. | Mixed workload: persistent + expiring keys, want to protect persistent.
volatile-ttl | Evict the key with the soonest TTL first. | You want "things closer to dying go first." Subtle pitfalls — measure.
volatile-random | Random victim from the TTL set. | Rarely the right answer. Cheap, predictable bad behavior.
The classic gotcha: you set TTLs on every key, set maxmemory-policy noeviction "to be safe," and then your traffic doubles. Active expiration can't keep up, memory hits the cap, writes start failing with OOM errors, and the dashboard shows you have plenty of "expired" keys still in RAM. The keys had TTL. Redis just hadn't gotten around to deleting them yet. Your cache was, in effect, a wall.
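An allkeys-lru policy in miniature: a toy cache that evicts the least-recently-used key when it hits its budget, TTL or not. This is a deliberately simplified model; real Redis approximates LRU by sampling candidates and budgets bytes via maxmemory, not a key count. The `LruCache` class is invented for this sketch.

```python
from collections import OrderedDict

class LruCache:
    """Toy allkeys-lru: when the key budget is hit, evict the
    least-recently-used key, whether or not it has a TTL."""

    def __init__(self, maxkeys):
        self.maxkeys = maxkeys
        self.data = OrderedDict()   # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # touch = now most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        while len(self.data) > self.maxkeys:
            self.data.popitem(last=False)  # evict the LRU victim

c = LruCache(maxkeys=3)
for k in ("a", "b", "c"):
    c.set(k, k.upper())
c.get("a")            # touch "a", so "b" becomes the LRU victim
c.set("d", "D")       # over budget: "b" is evicted
print(sorted(c.data))  # ['a', 'c', 'd']
```

Unlike noeviction, this cache keeps accepting writes under pressure, which is exactly the trade the gotcha above is about.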

What you actually monitor and tune

Metric (from INFO) | What it tells you | What to do
expired_keys | Total keys deleted by expiration since startup. | Watch the rate; sudden drops can signal active-expire stalling.
used_memory vs used_memory_dataset | Total RAM vs the part holding actual data. | Big gap = fragmentation or dead-key buildup. Investigate.
db0:keys vs db0:expires | Total keys, and how many have a TTL set. | Sudden growth in expires with flat expired_keys = sampler is behind.
Replication lag | Replica is processing DELs late. | Replicas hold expired-but-not-yet-DEL'd keys longer; expect a transient memory difference.
Slow log entries | Big DEL or expire-time freeing causing slowdowns. | Confirm lazyfree-lazy-expire yes. Check for unusually large keys.
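INFO output is plain `key:value` lines, which makes these metrics easy to scrape. A minimal sketch of pulling the numbers above out of a dump; `parse_info` is a made-up helper and the sample values below are illustrative, not from a real server.

```python
def parse_info(raw: str) -> dict:
    """Parse the key:value lines of a Redis INFO dump into a dict,
    skipping section headers like '# Stats'."""
    stats = {}
    for line in raw.splitlines():
        if line and not line.startswith("#") and ":" in line:
            key, _, value = line.partition(":")
            stats[key] = value
    return stats

# A trimmed INFO excerpt with illustrative values.
raw = """# Stats
expired_keys:104232
# Memory
used_memory:1073741824
used_memory_dataset:734003200
"""
info = parse_info(raw)
# The "big gap" signal from the table: RAM held vs RAM holding data.
gap = int(info["used_memory"]) - int(info["used_memory_dataset"])
print(info["expired_keys"], gap)  # 104232 339738624
```

In practice you'd track the expired_keys rate over time (a counter since startup) rather than its absolute value.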

The handful of config flags that matter:

# redis.conf — the expiration-relevant knobs

hz 10                               # default. crank to 50–100 for TTL-heavy workloads.
                                    # affects all background tasks, not just expiration.

active-expire-effort 1              # Redis 6+. range 1–10. higher = tighter dead-key bound,
                                    # more CPU. independent of hz.

lazyfree-lazy-expire yes            # async free for expired big keys. should be ON in modern
                                    # configs. without it, big-collection expiry blocks the loop.

lazyfree-lazy-eviction yes          # same, but for eviction (when maxmemory hits).

lazyfree-lazy-server-del yes        # async free for explicit DEL of big keys.

maxmemory 8gb                       # the actual safety net. set this.
maxmemory-policy allkeys-lru        # what happens when you hit it. pick deliberately.

Closing take

The mental model worth keeping is this: a Redis TTL is a contract about when reads start returning nil. It is not a contract about when memory is freed. The lazy + active hybrid makes the trade-off explicit — Redis spends almost no CPU on expiration in steady state, and accepts a bounded amount of memory overhead from dead keys that haven't been swept yet.

Where this matters: when you're sizing memory, when you're tuning hz for TTL-heavy workloads, when you're reading replicas and the numbers don't match, when you're staring at a "Redis is full but most of these keys should be expired" incident at 2 AM. Every one of those gets clearer once you internalize that the deletion is opportunistic, not punctual.

Set the right maxmemory, pick an eviction policy that matches the role (cache vs source-of-truth), enable lazy-free for big collections, and let the sampler do its job. That's the whole pattern.

If you remember one thing: a TTL of 60 seconds doesn't mean the key is gone in 60 seconds. It means nobody will see the value after 60 seconds. The actual memory free happens lazily, on access or sample, whichever comes first. Build your capacity model on that, not on the wall clock.