Web Request, DNS, and CDN Caching Reference Guide

DNS Resolution

DNS (Domain Name System) translates human-readable domain names into IP addresses so browsers can reach the correct server.

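Resolution Process

Browser/OS cache: The browser and operating system first check whether they already hold an unexpired answer for the name.
Local DNS resolver: On a cache miss, the operating system's stub resolver forwards the query to its configured recursive resolver, which is often run by the ISP but may be a public or corporate resolver.
Recursive lookups: If the resolver has no cached answer, it queries the DNS hierarchy on the client's behalf:
Root servers: Refer the resolver to the name servers for the relevant TLD (e.g., .com, .net).
TLD servers: Refer the resolver to the authoritative name servers for the domain.
Authoritative servers: Return the requested record, such as the domain's IP address.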
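
The lookup order above can be sketched as a toy resolver. This is a minimal illustration, not a real DNS client: the delegation tables, server labels, and the 192.0.2.10 address (from a documentation-only range) are hypothetical stand-ins for real root, TLD, and authoritative servers, and record TTLs are ignored.

```python
from typing import Optional

# Hypothetical in-memory data standing in for real DNS servers.
ROOT = {"com": "tld-com"}                              # TLD -> TLD server
TLD = {"tld-com": {"example.com": "ns-example"}}       # domain -> authoritative server
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}


class Resolver:
    """Toy recursive resolver with a local cache (no TTL handling)."""

    def __init__(self) -> None:
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, name: str) -> Optional[str]:
        # Step 1: answer from the cache when possible.
        if name in self.cache:
            return self.cache[name]

        # Step 2: the root refers us to the server for the TLD.
        self.upstream_queries += 1
        tld_server = ROOT.get(name.rsplit(".", 1)[-1])
        if tld_server is None:
            return None

        # Step 3: the TLD server refers us to the domain's authoritative server.
        self.upstream_queries += 1
        domain = ".".join(name.split(".")[-2:])
        auth_server = TLD[tld_server].get(domain)
        if auth_server is None:
            return None

        # Step 4: the authoritative server returns the address, which we cache.
        self.upstream_queries += 1
        ip = AUTHORITATIVE[auth_server].get(name)
        if ip is not None:
            self.cache[name] = ip
        return ip
```

Calling resolve("www.example.com") twice on the same Resolver performs the three upstream queries only once; the second call is answered from the cache.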

Key Points

ICANN oversees global domain management.
Registries manage TLDs; registrars sell domains to end users.
Name servers can be operated by hosting companies, CDNs, or managed DNS providers.

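DNS Records

A/AAAA: Map a domain or subdomain directly to an IP address (A for IPv4, AAAA for IPv6).
CNAME: Aliases a domain name to another domain name, not an IP; the resolver then looks up the target name.
Domains served through a CDN or managed hosting typically use a CNAME pointing at a provider hostname, while a site on a VPS with a fixed IP uses A/AAAA records.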
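
The difference between the record types can be seen by following a CNAME until an address record is reached. The sketch below uses a hypothetical in-memory record table; the hostnames and 192.0.2.x addresses are illustrative assumptions, and a real resolver would also handle TTLs and several records per name.

```python
from typing import Optional

# Hypothetical records: name -> (record type, value).
RECORDS = {
    "www.example.com": ("CNAME", "edge.cdn.example.net"),  # site fronted by a CDN
    "edge.cdn.example.net": ("A", "192.0.2.20"),
    "vps.example.com": ("A", "192.0.2.30"),                # direct VPS address
}


def resolve_address(name: str, max_hops: int = 8) -> Optional[str]:
    """Follow CNAME aliases until an A/AAAA record is found."""
    for _ in range(max_hops):
        record = RECORDS.get(name)
        if record is None:
            return None          # no such name
        rtype, value = record
        if rtype in ("A", "AAAA"):
            return value         # address record ends the chain
        name = value             # CNAME: continue with the target name
    raise RuntimeError("CNAME chain too long or looping")
```

Here www.example.com resolves through the alias to 192.0.2.20, while vps.example.com resolves in a single step.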

Diagram: DNS Resolution Flow

User Browser
    |
    v
OS/Browser Cache
    |
    v
Local DNS Resolver
    |
    v
Root Server -> TLD Server -> Authoritative Server
    |
    v
Return IP to Browser

Origin Servers and CDN Basics

Origin server: Your VPS or main web server where content resides.
CDN (Content Delivery Network): Network of edge servers that cache content globally.
Benefits of using a CDN:
Reduced latency
Load balancing
Failover support

Edge Servers

Independent servers with local caches.
Serve content to users closest to them.
Distributed caching means a cache miss at one edge doesn't imply a global miss.

Multi-Origin Setup

CDNs can be configured with multiple origins.
Edge servers choose the best origin based on:
Geography (closest origin)
Latency measurements
Availability/failover
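
One way to combine the three criteria is to filter out unavailable origins and then rank the rest by region match and latency. This is a sketch under assumed inputs: the Origin fields, region labels, and the ordering rule are hypothetical, and real CDNs expose their own configuration for this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Origin:
    name: str
    region: str
    latency_ms: float
    healthy: bool = True


def choose_origin(origins: list, edge_region: str) -> Optional[Origin]:
    """Pick a healthy origin, preferring the edge's region, then low latency."""
    candidates = [o for o in origins if o.healthy]  # availability/failover
    if not candidates:
        return None
    # False sorts before True, so same-region origins win; latency breaks ties.
    return min(candidates, key=lambda o: (o.region != edge_region, o.latency_ms))
```

If the same-region origin is marked unhealthy, the function falls back to the next best candidate, which mirrors the failover case.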

Stale-While-Revalidate (SWR)

Allows an edge to serve stale content while asynchronously fetching fresh content.
Improves perceived latency because users are not kept waiting on the origin, but does not reduce total origin requests: each revalidation is still an origin fetch.
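
The decision an edge makes for a cached object can be sketched as a function of the object's age. The example assumes a response header such as Cache-Control: max-age=60, stale-while-revalidate=300; the function name and the state labels are hypothetical.

```python
def cache_state(age_s: float, max_age_s: float, swr_s: float) -> str:
    """Classify a cached object by age, given max-age and the SWR window."""
    if age_s < max_age_s:
        return "fresh"             # serve from cache, no origin contact
    if age_s < max_age_s + swr_s:
        return "stale-revalidate"  # serve stale now, refresh in the background
    return "expired"               # too old: fetch from origin before serving
```

With max-age=60 and stale-while-revalidate=300, an object aged 120 seconds is served immediately while a background fetch refreshes it, and an object aged 400 seconds must be refetched first.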

Diagram: CDN Edge Request Flow

User Browser
    |
    v
CDN Edge Server
    |-- Cache Hit --> Serve Content
    |-- Cache Miss --> Fetch from Origin
        |-- Primary Origin Available? --> Yes: Serve & Cache
        |-- Primary Origin Down --> Alternate Origin: Serve & Cache

CDN Edge Nodes and Caching Behavior

Edge nodes: Separate physical/virtual servers distributed globally.
Each node caches content independently.
Cache hit rate of ~85% is typical; it stays below 100% because of:
Distributed cache architecture (each node must warm its own cache)
Invalidations and purges
Low TTL content
Dynamic routing to different nodes
Multi-origin fallback ensures content delivery even if one origin is unavailable.
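
The effect of independent node caches on hit rate can be illustrated with a small simulation. It assumes round-robin routing across nodes and caches that never expire, both simplifications chosen for the example; the function name is hypothetical.

```python
def hit_rate(requests: list, node_count: int) -> float:
    """Fraction of requests served from cache when each node caches on its own."""
    if not requests:
        return 0.0
    caches = [set() for _ in range(node_count)]  # one independent cache per node
    hits = 0
    for i, url in enumerate(requests):
        cache = caches[i % node_count]           # assumed round-robin routing
        if url in cache:
            hits += 1
        else:
            cache.add(url)                       # miss: node fetches and stores
    return hits / len(requests)
```

Ten requests for the same URL produce one miss on a single node but two misses when spread over two nodes, because each node has to fetch its own copy.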

Cache Lifecycle

Cache miss triggers fetch from origin.
Content is stored in the edge node cache.
Subsequent requests served from cache until TTL expires or cache is invalidated.
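
The lifecycle above maps onto a small TTL cache. This is a minimal sketch: the EdgeCache class and the fetch_origin callback are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```python
from typing import Callable


class EdgeCache:
    """Single edge node cache with a fixed TTL."""

    def __init__(self, fetch_origin: Callable[[str], str], ttl_s: float) -> None:
        self.fetch_origin = fetch_origin
        self.ttl_s = ttl_s
        self.store = {}  # url -> (body, expires_at)

    def get(self, url: str, now: float):
        entry = self.store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0], "HIT"                    # served from cache
        body = self.fetch_origin(url)                 # miss or expired: go to origin
        self.store[url] = (body, now + self.ttl_s)    # store with a new expiry
        return body, "MISS"

    def invalidate(self, url: str) -> None:
        self.store.pop(url, None)                     # next request will miss
```

A request at time 0 is a miss, a request at time 30 with a 60-second TTL is a hit, and a request after expiry or after invalidate() goes back to the origin.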

Diagram: Distributed Edge Cache Example

User A (London) --> Edge Node 1: Cache Miss --> Origin (UK) --> Cached at Node 1
User B (London) --> Edge Node 2: Cache Miss --> Origin (UK or fallback) --> Cached at Node 2
Edge caches operate independently; one node's cache doesn't affect others

Surrogate Keys

Surrogate keys: Labels/tags assigned to cached objects for targeted invalidation.
Multiple keys per object allowed.
Useful for grouping content logically (e.g., category, author, event).
Invalidation by key allows selective cache purges without affecting unrelated content.

Surrogate Key Management

Granularity: Avoid keys that are too broad (a purge evicts far more than what changed) or too narrow (a single change needs many purges).
Naming conventions: Consistency is crucial for automation.
Automation: Dynamically generate keys based on content metadata.
Multi-origin/multi-edge consistency: Ensure keys match across all origins and edge nodes.
Interaction with SWR and TTL: Key-based purges handle content that changes before its TTL expires, while TTL and SWR cover routine freshness.
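
Automated key generation with a consistent naming convention might look like the sketch below. The metadata fields (id, category, author) and the prefix-slug convention are assumptions made for illustration; how keys are attached to a response varies by CDN.

```python
import re


def slug(value) -> str:
    """Lowercase a value and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", str(value).lower()).strip("-")


def surrogate_keys(metadata: dict) -> list:
    """Build surrogate keys from content metadata using one naming scheme."""
    keys = [f"article-{slug(metadata['id'])}"]   # narrow key: this object only
    for field in ("category", "author"):         # broader grouping keys
        if metadata.get(field):
            keys.append(f"{field}-{slug(metadata[field])}")
    return keys
```

Because every origin would call the same function on the same metadata, the keys stay identical across origins, which is what key-based purging relies on.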

Diagram: Surrogate Key Cache Invalidation

Edge Cache
|-- Object: Article-123 [Keys: article-123, sports, football]
|-- Object: Article-124 [Keys: article-124, sports]

Invalidate 'sports' --> Both Article-123 & Article-124 purged; each is refetched from origin on its next request
Invalidate 'article-123' --> Only Article-123 purged and refetched
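
The diagram above can be reproduced with a cache that keeps an index from each key to the objects tagged with it. This is a sketch of the idea only: the TaggedCache class is hypothetical, and it assumes each object is stored once with a fixed set of keys.

```python
from collections import defaultdict


class TaggedCache:
    """Edge cache with a key -> object index for targeted purges."""

    def __init__(self) -> None:
        self.objects = {}              # object id -> cached body
        self.keys_of = {}              # object id -> set of its keys
        self.index = defaultdict(set)  # key -> ids of objects tagged with it

    def put(self, object_id: str, body: str, keys: list) -> None:
        self.objects[object_id] = body
        self.keys_of[object_id] = set(keys)
        for key in keys:
            self.index[key].add(object_id)

    def purge(self, key: str) -> set:
        """Remove every object tagged with key; return the purged ids."""
        purged = self.index.pop(key, set())
        for object_id in purged:
            self.objects.pop(object_id, None)
            # Drop the object from the index entries of its other keys.
            for other in self.keys_of.pop(object_id, set()):
                if other != key:
                    self.index[other].discard(object_id)
        return purged
```

With Article-123 tagged article-123, sports, football and Article-124 tagged article-124, sports, purging 'sports' removes both objects, while purging 'article-123' removes only the first.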