Part 1 covered what CDNs are and why you need them. Part 2 covers how they actually work: request flow, multi-tier architecture, and caching decisions.
Request Flow
Cache HIT (Fast Path)
User requests cdn.example.com/logo.png
Assume: User ↔ Edge RTT = 30ms
1. DNS lookup - 30ms (1 RTT)
User → DNS: query [15ms]
DNS → User: edge IP [15ms]
2. User ↔ Edge: TCP handshake - 45ms (1.5 RTT)
User → Edge: SYN [15ms]
Edge → User: SYN-ACK [15ms]
User → Edge: ACK [15ms]
3. User ↔ Edge: TLS handshake - 30ms (1 RTT, TLS 1.3)
User → Edge: ClientHello [15ms]
Edge → User: ServerHello + cert [15ms]
4. HTTP request + cache hit - 16ms
User → Edge: GET /logo.png [15ms]
Edge checks cache → HIT [1ms local]
5. Edge → User: HTTP response - 15ms
Edge → User: 200 OK + image data [15ms]
Total: ~136ms
Cache MISS (Slow Path)
User requests cdn.example.com/video.mp4
Assume: User ↔ Edge RTT = 30ms, Edge ↔ Origin RTT = 200ms (cross-continent)
1. DNS lookup - 30ms (1 RTT)
User → DNS: query [15ms]
DNS → User: edge IP [15ms]
2. User ↔ Edge: TCP handshake - 45ms (1.5 RTT)
User → Edge: SYN [15ms]
Edge → User: SYN-ACK [15ms]
User → Edge: ACK [15ms]
3. User ↔ Edge: TLS handshake - 30ms (1 RTT, TLS 1.3)
User → Edge: ClientHello [15ms]
Edge → User: ServerHello + cert [15ms]
4. User → Edge: HTTP request - 16ms
User → Edge: GET /video.mp4 [15ms]
Edge checks cache → MISS [1ms local]
5. Edge ↔ Origin: TCP handshake - 300ms (1.5 RTT)
Edge → Origin: SYN [100ms]
Origin → Edge: SYN-ACK [100ms]
Edge → Origin: ACK [100ms]
6. Edge ↔ Origin: TLS handshake - 200ms (1 RTT)
Edge → Origin: ClientHello [100ms]
Origin → Edge: ServerHello + cert [100ms]
7. Edge → Origin: HTTP request + response - 410ms
Edge → Origin: GET /video.mp4 [100ms]
Origin processes request [10ms]
Origin → Edge: 200 OK + 10MB video data [100ms + 200ms transfer]
8. Edge writes to cache - 5ms local
9. Edge → User: HTTP response - 115ms
Edge → User: 200 OK + video data [15ms + 100ms transfer]
First request total: ~1,151ms
Subsequent requests: ~236ms (steps 1-4 and 9 only, cache HIT; a small object such as the logo above takes ~136ms)
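The step-by-step totals above can be reproduced with a small latency model. This is a minimal sketch that uses the same simplifications as the walkthrough (1.5 RTT for TCP, 1 RTT for TLS 1.3, one-way time = RTT / 2). The function and parameter names are illustrative, not part of any CDN API. In practice the client can send its ClientHello right behind the final TCP ACK without waiting, so real connection setup is somewhat faster than this model suggests.

```python
def connection_setup_ms(rtt_ms):
    """TCP handshake (1.5 RTT, as counted above) plus TLS 1.3 handshake (1 RTT)."""
    return 1.5 * rtt_ms + 1.0 * rtt_ms


def cache_hit_ms(edge_rtt_ms, lookup_ms=1, transfer_ms=0):
    """Latency of the fast path: DNS, connection setup, request, response."""
    dns = edge_rtt_ms                           # 1 RTT, assumed equal to the edge RTT
    setup = connection_setup_ms(edge_rtt_ms)    # TCP + TLS to the edge
    request = edge_rtt_ms / 2 + lookup_ms       # one-way trip + local cache lookup
    response = edge_rtt_ms / 2 + transfer_ms    # one-way trip + payload transfer
    return dns + setup + request + response


def cache_miss_ms(edge_rtt_ms, origin_rtt_ms, origin_processing_ms=10,
                  origin_transfer_ms=200, cache_write_ms=5,
                  user_transfer_ms=100, lookup_ms=1):
    """Latency of the slow path: the user-side steps plus a full origin fetch."""
    user_side = cache_hit_ms(edge_rtt_ms, lookup_ms, user_transfer_ms)
    origin_fetch = (connection_setup_ms(origin_rtt_ms)  # TCP + TLS to the origin
                    + origin_rtt_ms                      # request out, first byte back
                    + origin_processing_ms
                    + origin_transfer_ms)
    return user_side + origin_fetch + cache_write_ms


hit_total = cache_hit_ms(edge_rtt_ms=30)
miss_total = cache_miss_ms(edge_rtt_ms=30, origin_rtt_ms=200)
```

With the assumptions from the walkthrough (30ms edge RTT, 200ms origin RTT), `hit_total` and `miss_total` match the ~136ms and ~1,151ms figures. Changing `origin_rtt_ms` shows how quickly the miss penalty grows with distance to origin.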
Flow Diagram
User → DNS → Edge Server
├─ Cache HIT (~136ms)
└─ Cache MISS → Parent CDN
├─ Cache HIT (~600ms)
└─ Cache MISS → Origin (~1,151ms)
Multi-Tier Architecture
Why Multiple Tiers?
Single-tier problem: 200 edges miss simultaneously → 200 parallel requests for the same object slam origin → origin overloaded.
Multi-tier solution: 200 child edges → 10 parent CDNs. Parents de-duplicate → at most 10 requests to origin (not 200). 95% origin load reduction. AWS CloudFront calls the parent tier Origin Shield; Cloudflare's equivalent is Tiered Cache.
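De-duplication at the parent is often implemented as request coalescing: concurrent misses for the same key wait on a single in-flight origin fetch. The sketch below illustrates the idea with `asyncio`. `CoalescingCache` and `fetch_from_origin` are hypothetical names, and a real parent tier would also handle errors, expiry, and eviction.

```python
import asyncio


class CoalescingCache:
    """Parent-tier cache that collapses concurrent misses into one origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical async callable: key -> bytes
        self._cache = {}
        self._in_flight = {}             # key -> asyncio.Task for a fetch in progress

    async def get(self, key):
        if key in self._cache:
            return self._cache[key]
        task = self._in_flight.get(key)
        if task is None:
            # First miss for this key: start the only origin fetch.
            task = asyncio.create_task(self._fill(key))
            self._in_flight[key] = task
        # Every concurrent caller waits on the same task.
        return await task

    async def _fill(self, key):
        try:
            body = await self._fetch(key)
            self._cache[key] = body
            return body
        finally:
            self._in_flight.pop(key, None)


async def demo():
    origin_calls = 0

    async def fetch_from_origin(key):
        nonlocal origin_calls
        origin_calls += 1
        await asyncio.sleep(0.01)  # stand-in for the slow origin round trip
        return f"body of {key}".encode()

    parent = CoalescingCache(fetch_from_origin)
    # 20 child edges miss on the same object at the same time.
    results = await asyncio.gather(*(parent.get("/video.mp4") for _ in range(20)))
    return origin_calls, results


origin_calls, results = asyncio.run(demo())
```

In this demo all 20 simultaneous requests are served by one origin fetch, which is the effect the parent tier has on origin load.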
Math
1M requests/day:
├─ Child hit (90%): 900K served from edge
├─ Parent hit (85% of 10%): 85K served from parent
└─ Origin hit: 15K (1.5% of total)
Without parent: 100K origin requests
With parent: 15K origin requests
Reduction: 85%
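The same arithmetic as a small function, handy for trying other hit ratios. A minimal sketch; the function name and parameters are illustrative.

```python
def origin_requests(total_requests, child_hit_ratio, parent_hit_ratio=0.0):
    """Requests that reach origin after the child tier and an optional parent tier."""
    child_misses = round(total_requests * (1 - child_hit_ratio))
    return round(child_misses * (1 - parent_hit_ratio))


without_parent = origin_requests(1_000_000, child_hit_ratio=0.90)
with_parent = origin_requests(1_000_000, child_hit_ratio=0.90, parent_hit_ratio=0.85)
reduction = 1 - with_parent / without_parent
```

For the numbers above this gives 100,000 origin requests without a parent tier, 15,000 with one, and a reduction of 85%.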
The Three Tiers
Child CDN (Edge): 200+ globally (CloudFront 450+). 1-10 TB storage. Hot content (top 10-20%). 85-95% hit ratio.
Parent CDN (Origin Shield): 10-20 globally (AWS has 13). 50-500 TB storage. Long tail content. 80-90% hit ratio on child misses.
Origin: 1 server/cluster. Unlimited storage. Gets hit only on double miss (~1-2% of total).
Edge Server Selection
DNS-Based Geo-Routing: User queries DNS → GeoIP maps IP to location → returns nearest edge IP. "Nearest" = geographic distance + network topology + measured latency + server load.
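One way to picture the "nearest" calculation is a weighted score over distance, measured latency, and load. The sketch below is an assumption for illustration only: the weights, the `Edge` fields, and the coordinates are made up, and network topology is not modeled. Real CDNs use their own proprietary signals.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Edge:
    name: str
    lat: float
    lon: float
    latency_ms: float  # measured RTT toward the user's network
    load: float        # 0.0 (idle) to 1.0 (saturated)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))


def score(edge, user_lat, user_lon, w_distance=0.01, w_latency=1.0, w_load=50.0):
    """Lower is better. The weights are arbitrary illustrative values."""
    distance_km = haversine_km(user_lat, user_lon, edge.lat, edge.lon)
    return w_distance * distance_km + w_latency * edge.latency_ms + w_load * edge.load


def pick_edge(edges, user_lat, user_lon):
    return min(edges, key=lambda e: score(e, user_lat, user_lon))


edges = [
    Edge("frankfurt", 50.11, 8.68, latency_ms=12, load=0.95),
    Edge("london", 51.51, -0.13, latency_ms=14, load=0.20),
    Edge("new-york", 40.71, -74.01, latency_ms=80, load=0.10),
]
best = pick_edge(edges, user_lat=48.86, user_lon=2.35)  # user in Paris
```

Here the Frankfurt edge has the lowest latency but is nearly saturated, so the lightly loaded London edge wins. This is why "nearest" is not purely geographic.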
DNS TTL problem: DNS cached 60-300s. Edge dies → users hit dead edge until TTL expires. Trade-off: lower TTL = faster failover, more DNS queries.
Anycast (alternative): Multiple edges announce the same IP (e.g. Cloudflare's 1.1.1.1). BGP routes each user to the nearest one. Faster failover, no DNS TTL delay. Used by Cloudflare, Google.
Health checks: CDN pings edges. Dead server removed from DNS pool. Existing cached DNS still affected (TTL delay).
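The interaction between health checks and DNS TTL can be sketched as follows. `EdgePool`, the failure threshold, and the probe interval are assumptions for illustration; the IPs come from the reserved documentation range 192.0.2.0/24.

```python
class EdgePool:
    """Tracks consecutive probe failures and exposes only healthy edge IPs."""

    def __init__(self, edge_ips, failure_threshold=3):
        self._failures = {ip: 0 for ip in edge_ips}
        self._threshold = failure_threshold

    def record_probe(self, ip, ok):
        # A single success resets the counter; failures accumulate.
        self._failures[ip] = 0 if ok else self._failures[ip] + 1

    def healthy(self):
        """IPs that DNS may still hand out."""
        return [ip for ip, n in self._failures.items() if n < self._threshold]


def worst_case_exposure_s(probe_interval_s, failure_threshold, dns_ttl_s):
    """Upper bound on how long a user can keep hitting a dead edge:
    time to detect the failure plus the lifetime of an already-cached DNS answer."""
    return probe_interval_s * failure_threshold + dns_ttl_s


pool = EdgePool(["192.0.2.10", "192.0.2.11"])
for _ in range(3):
    pool.record_probe("192.0.2.10", ok=False)  # three failed probes in a row
exposure = worst_case_exposure_s(probe_interval_s=10, failure_threshold=3, dns_ttl_s=60)
```

With a 10s probe interval, a threshold of 3, and a 60s TTL, the dead edge leaves the pool after about 30s, but clients with a cached answer may keep using it for up to 90s in total. Lowering the TTL shrinks the second term at the cost of more DNS queries.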
Caching Strategies: Pull vs Push
Pull (Lazy): Fetch on-demand. Edge starts empty → cache miss → fetch from origin → cache locally. Efficient storage, scales to unlimited catalog. First request slow, origin sees more traffic. Best for: long tail, user-generated content.
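A pull cache can be sketched in a few lines: look up locally, fetch from origin on a miss, store the result. The LRU eviction and the `max_entries` limit below are assumptions added to reflect finite edge storage; `fetch_from_origin` is a hypothetical callable.

```python
from collections import OrderedDict


class PullCache:
    """Edge cache that starts empty and fills itself on demand."""

    def __init__(self, fetch_from_origin, max_entries=1000):
        self._fetch = fetch_from_origin  # hypothetical callable: key -> bytes
        self._max = max_entries
        self._store = OrderedDict()      # least recently used entry first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._store[key]
        self.misses += 1
        body = self._fetch(key)           # slow path: go to origin
        self._store[key] = body
        if len(self._store) > self._max:
            self._store.popitem(last=False)  # evict the least recently used entry
        return body


origin_log = []


def fetch_from_origin(key):
    origin_log.append(key)
    return f"content of {key}".encode()


edge = PullCache(fetch_from_origin, max_entries=2)
edge.get("/a.png")  # miss: fetched from origin
edge.get("/a.png")  # hit: served locally
edge.get("/b.png")  # miss
edge.get("/c.png")  # miss: cache is full, /a.png is evicted
```

Only content that is actually requested takes up space, which is what makes pull suitable for a long tail. The cost is the origin fetch on every first request and after every eviction.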
Push (Proactive): Pre-position content. Origin pushes to all edges before requests arrive. No first-request miss penalty, predictable latency. Wastes storage/bandwidth on unpopular content. Best for: top 1-20% popular, static assets, new releases.
Hybrid (reality): 80/20 rule - PUSH top 20% (covers 80% requests), PULL bottom 80% (covers 20% requests). Netflix: PUSH new releases/trending, PULL obscure documentaries. E-commerce: PUSH homepage/bestsellers, PULL user reviews/long-tail.
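The hybrid split can be sketched as ranking objects by request count and pushing the top slice. The request counts below are invented so that the top 20% of objects receive 80% of requests; real traffic distributions vary, and the function name is illustrative.

```python
def split_push_pull(request_counts, push_fraction=0.20):
    """Return (push_keys, pull_keys, share_of_requests_covered_by_push)."""
    if not request_counts:
        return [], [], 0.0
    # Most requested objects first.
    ranked = sorted(request_counts, key=request_counts.get, reverse=True)
    push_count = max(1, round(len(ranked) * push_fraction))
    push, pull = ranked[:push_count], ranked[push_count:]
    total = sum(request_counts.values())
    covered = sum(request_counts[k] for k in push) / total
    return push, pull, covered


# Hypothetical daily request counts for 10 objects.
counts = {"a": 500, "b": 300, "c": 50, "d": 40, "e": 30,
          "f": 25, "g": 20, "h": 15, "i": 10, "j": 10}
push, pull, covered = split_push_pull(counts)
```

For this sample, pushing two of the ten objects covers 80% of requests, while the remaining eight are left to pull on demand.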
Decision Matrix
| Content Type | Popularity | Change Frequency | Strategy |
|---|---|---|---|
| Company logo | High | Never | PUSH |
| Homepage CSS | High | Weekly | PUSH |
| Product image | Medium | Rare | PUSH (bestseller) / PULL (not) |
| User avatar | Low | Often | PULL |
| API response | N/A | Real-time | Don't cache |
Cache Key Design
Unique identifier for cached content. Determines if two requests get same content. Simple: cache_key = URL. Complex: include headers (device type, encoding, language).
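    import hashlib

    def generate_cache_key(url, headers):
        """Build a cache key from the URL plus headers that change the response."""
        key_parts = [url]
        # Raw User-Agent strings are nearly unique per browser build; in practice,
        # normalize them to a device class (mobile/desktop) to keep the hit ratio up.
        for header in ['Accept-Encoding', 'User-Agent']:
            if header in headers:
                key_parts.append(f"{header}:{headers[header]}")
        # MD5 serves as a compact fingerprint here, not as a security measure.
        return hashlib.md5('|'.join(key_parts).encode()).hexdigest()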
Pitfalls: Too specific → low hit ratio. Too broad → serve wrong content (mobile gets desktop). Query params: image.jpg?user=123 → don't cache (per-user). image.jpg?v=2 → cache (versioning).
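The query-parameter rules can be sketched as a normalization step that runs before the key is built. The parameter lists below are hypothetical and would need to match your own application; treating tracking parameters as ignorable is an added assumption, not something every site can do safely.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PER_USER_PARAMS = {"user", "session", "token"}                 # hypothetical: bypass cache
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}  # hypothetical: drop from key


def cache_key_url(url):
    """Return a normalized URL to use as the cache key, or None to bypass the cache."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if any(name in PER_USER_PARAMS for name, _ in params):
        return None  # per-user content must not be shared between users
    # Drop parameters that do not change the response and sort the rest,
    # so ?a=1&b=2 and ?b=2&a=1 map to the same cache entry.
    kept = sorted((n, v) for n, v in params if n not in IGNORED_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))


versioned = cache_key_url("https://cdn.example.com/image.jpg?v=2")
per_user = cache_key_url("https://cdn.example.com/image.jpg?user=123")
tracked = cache_key_url("https://CDN.example.com/image.jpg?utm_source=mail&v=2")
```

The versioned URL keeps its `v=2` parameter, the per-user URL is not cached at all, and the tracked URL collapses onto the same key as the versioned one.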
Key Takeaways
- Multi-tier: 10% child miss × 15% parent miss = 1.5% origin hits (Origin Shield de-duplicates requests)
- DNS routing + health checks = automatic failover (TTL delay caveat)
- Hybrid wins: PUSH popular (20%), PULL long tail (80%)
- Cache keys determine correctness - balance specificity vs hit ratio