CDN Design Part 3: Cache Invalidation Strategies

You've deployed a CDN. Your homepage HTML is cached on 200 edge servers across 50 countries. At 9 AM, marketing pushes a critical update: Black Friday sale goes live. How do you ensure a billion users see the new page, not the stale cached version from yesterday?

The Problem

The 200 edge servers share no memory, so when the origin updates content, the edges don't know. Strong consistency means coordinating every edge on every update, which hurts availability. Invalidating everything at once triggers a cache miss storm: all edges refetch simultaneously, and the origin takes a fetch from each of the 200 edges where it normally serves one (a 200x spike). You can't maximize consistency, availability, and performance at the same time.

Four Invalidation Strategies

1. TTL (Time-to-Live)

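Origin sets an expiration time via Cache-Control: max-age=3600. Each edge serves its cached copy until the TTL expires, then fetches a fresh version. With a 1-hour TTL, an update published at 10 AM can stay invisible until as late as 11 AM, depending on when each edge last fetched the page.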

Key points: Simple (set header, forget). No coordination overhead. Staleness is bounded by the TTL, but you can't force an immediate update. Trade-off: short TTL = more origin load, long TTL = longer staleness.
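
The freshness check an edge runs on each request can be sketched as below. This is a minimal illustration, not how any particular CDN implements it: the CacheEntry record and both function names are hypothetical, and only max-age is considered (real caches also account for directives such as s-maxage and the Age header).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    """Hypothetical record an edge keeps for one cached URL."""
    body: bytes
    stored_at: float     # seconds since epoch when the edge stored the response
    cache_control: str   # raw Cache-Control header sent by the origin


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value in seconds, or None if the directive is absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(entry: CacheEntry, now: float) -> bool:
    """True while the entry's age is below its max-age."""
    ttl = max_age_seconds(entry.cache_control)
    if ttl is None:
        return False  # assumption for this sketch: no max-age means always refetch
    return (now - entry.stored_at) < ttl
```

With max-age=3600 and an entry stored at 10 AM, is_fresh returns True for any request before 11 AM and False from 11 AM onward, which is the staleness window described above.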

2. Purge (Explicit Invalidation)

An API call deletes the URL from all edges. The next request at each edge is a cache miss and goes to the origin. Because every edge misses at about the same time, the origin takes a burst of fetches: an origin storm.

10:01 AM: Call purge API for homepage.html
10:02 AM: All 200 edges delete cache
10:03 AM: 1M requests → up to 1M cache misses (if edges don't collapse concurrent misses) → origin overwhelmed → dead

Key points: Near-immediate consistency (propagation takes 5-30 seconds). Full control over what gets invalidated and when. Cache miss storm risk: the origin must absorb a fetch from every edge at once (the 200x spike). Requires manual intervention.
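
The size of the storm can be estimated with back-of-the-envelope arithmetic. The sketch below assumes traffic is spread evenly across edges and that an origin fetch takes a fixed time. The function name, its parameters, and the coalesce flag (edges collapsing concurrent misses for the same URL into one origin fetch) are illustrative assumptions, not something a specific CDN exposes.

```python
def origin_requests_after_purge(
    edges: int,
    total_rps: float,
    origin_latency_s: float,
    coalesce: bool,
) -> int:
    """Estimate origin fetches for one URL right after a global purge.

    Until an edge's first refetch completes (origin_latency_s later),
    every request that reaches that edge is a cache miss.
    """
    if coalesce:
        # Each edge collapses its concurrent misses into a single origin fetch.
        return edges
    per_edge_rps = total_rps / edges
    misses_per_edge = max(1, round(per_edge_rps * origin_latency_s))
    return edges * misses_per_edge
```

With 200 edges, 1M requests per second, and an assumed 1-second origin fetch, the uncoalesced estimate is 1,000,000 origin requests, matching the timeline above. With coalescing it drops to 200, one per edge, which is where the 200x figure comes from.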

3. Versioned URLs

Include a version in the URL: style.css?v=2 or style.abc123.css. Every update produces a new URL. The old URL stays cached and the new URL is fetched fresh, so nothing needs to be invalidated.

Old: style.css?v=1 (cached)
New: style.css?v=2 (fresh fetch)
Update HTML to reference v=2. No purge. No storm.

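Key points: Zero invalidation overhead. Immutable caching (cache forever). Safe rollbacks (point the HTML back at the old URL). Requires build system changes. Doesn't work for HTML entry points: users request a fixed URL such as the homepage, so its address can't change.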
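
Build tools typically derive the version from the file contents, so the URL changes only when the file does. A minimal sketch of that idea, assuming a hypothetical hashed_name helper and an 8-character SHA-256 prefix (bundlers choose their own hash and length):

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Return a content-addressed name, e.g. 'css/style.<hash>.css' for 'css/style.css'."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    # Insert the digest between the stem and the extension.
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

The same bytes always map to the same name, and any change to the bytes yields a new name. The build step would write the file under that name and rewrite the HTML reference to match, after which the asset can be cached indefinitely.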

4. Soft Purge / Stale-While-Revalidate

Mark the content stale but keep serving it. The edge fetches a fresh version in the background, and requests that arrive after that fetch completes get the updated content.

Cache-Control: max-age=3600, stale-while-revalidate=86400
After 1 hour, and for up to 24 hours more: serve stale, fetch fresh in background

10:02 AM: User requests → edge serves OLD (instant)
          → edge fetches NEW in background (async)
10:03 AM: Next user gets NEW
Staleness window: about 1 request per edge, until the background fetch completes (~1 minute here)

Key points: No blocking cache miss while a stale copy exists (instant response always). No origin storm (background fetches spread out). Best performance. Temporary inconsistency (at least 1 request per edge sees stale content). Not for critical updates.
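
The three states a cached response moves through under max-age plus stale-while-revalidate can be sketched as a small decision function. The Action enum and decide function are hypothetical names for illustration, and the sketch leaves out the background fetch itself.

```python
from enum import Enum


class Action(Enum):
    SERVE_FRESH = "serve cached copy"
    SERVE_STALE_AND_REVALIDATE = "serve cached copy, refresh in background"
    FETCH_FROM_ORIGIN = "block on origin fetch"


def decide(age_s: float, max_age_s: int, swr_s: int) -> Action:
    """Classify a cached response by its age in seconds."""
    if age_s < max_age_s:
        return Action.SERVE_FRESH
    if age_s < max_age_s + swr_s:
        # Past max-age but inside the stale-while-revalidate window.
        return Action.SERVE_STALE_AND_REVALIDATE
    return Action.FETCH_FROM_ORIGIN
```

With the header above (max-age=3600, stale-while-revalidate=86400), a copy that is 2 hours old falls in the middle state: the user gets the stale copy immediately while the edge refreshes it. Only a copy older than 25 hours forces a blocking fetch.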

Trade-offs Table

Strategy       | Consistency                | Performance           | Complexity           | Cost
---------------+----------------------------+-----------------------+----------------------+-------------------
TTL            | Weak (stale until expiry)  | High (no misses)      | Low (set header)     | Low
Purge          | Strong (immediate)         | Low (miss storm)      | Medium (API calls)   | High (origin load)
Versioned URLs | Strong (new URL)           | High (no misses)      | High (app changes)   | Low
Soft Purge     | Eventual (1 request delay) | Highest (serve stale) | Medium (CDN support) | Low

When to Use What

TTL (default): User profile images, like counts, blog posts. Staleness acceptable, eventual consistency fine.

Purge: Low-traffic content, urgent fixes (security patches), legal requirements (GDPR). Works best when the origin has enough capacity to absorb the refetch.

Versioned URLs: High traffic assets (CSS, JS), immutable content, build-time assets (webpack, vite). SPAs, modern web apps.

Soft Purge: High traffic pages with tolerable staleness (homepage, product pages). News sites, dashboards. Performance + freshness, not critical consistency.
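
One way to encode these choices is a single function that picks the Cache-Control header per URL. The sketch below is illustrative only: the function name, the filename convention used to detect versioned assets, and every number in the headers are assumptions to adapt, not recommendations from a particular CDN.

```python
def cache_control_for(path: str) -> str:
    """Pick a Cache-Control header from the URL path (illustrative policy)."""
    name = path.rsplit("/", 1)[-1]
    parts = name.split(".")
    # Assumption: versioned build output looks like 'style.abc123.css'
    # (name.hash.ext). This naive check would also match 'app.min.js'.
    if len(parts) >= 3 and parts[-1] in {"css", "js"}:
        return "public, max-age=31536000, immutable"
    # HTML entry points: short TTL plus stale-while-revalidate.
    if name == "" or name.endswith(".html"):
        return "public, max-age=60, stale-while-revalidate=600"
    # Everything else: plain TTL.
    return "public, max-age=3600"
```

Here versioned assets are cached for a year, HTML pages get a short TTL with a soft-purge window, and everything else falls back to a plain 1-hour TTL. Purge stays out of the routine path entirely.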

Conclusion

Default to TTL + versioned URLs. TTL handles roughly 80% of content (set an expiration and forget it). Versioned URLs cover the other 20% (immediate updates, no invalidation overhead). Reserve purge for emergencies, and use soft purge for high-traffic pages that tolerate brief staleness. The worst choice is manual purges for routine updates. Pick consistency guarantees upfront: strong (financial) → short TTL + purge; eventual (social) → long TTL; immutable (assets) → cache forever.
