Multi-Layer Caching Strategy
smo1 intermediate 6 min read
ELI5
SMO1 stores copies of link data in four places, each closer or further from the user. It is like having a photo of your ID: you carry one in your wallet (browser cache), the building security desk has a copy (edge KV), the HR office has the original file (Redis), and the government database has the master record (PostgreSQL). When you update your hairstyle, every copy needs updating — but the wallet photo is fastest to check.
Technical Deep Dive
Four-Layer Cache Hierarchy
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#e8f4f8', 'primaryTextColor': '#2d3748', 'primaryBorderColor': '#90cdf4', 'lineColor': '#718096', 'secondaryColor': '#f0fff4', 'tertiaryColor': '#fefcbf'}}}%%
flowchart TB
    subgraph User["User's Browser"]
        RC[React Query<br/>Client Cache]
    end
    subgraph Edge["Cloudflare Edge"]
        KV["Cloudflare KV<br/>link:{slug}"]
    end
    subgraph Backend["purr-api"]
        RD[Redis<br/>link URL cache]
        JW[JWKS Cache<br/>airlock keys]
    end
    subgraph Origin["Origin"]
        PG[(PostgreSQL)]
    end

    U[User] -->|Dashboard| RC
    RC -->|Stale after 5 min| P1[purr-api]
    U -->|Short link| KV
    KV -->|Miss| P2[purr-api]
    P1 --> RD
    P2 --> RD
    RD -->|Miss| PG
    P1 --> JW
    P2 --> JW
    JW -->|Miss| Airlock[Airlock JWKS]

Layer 1: React Query (Client-Side)
Where: meow-web browser tab
What: User data, link lists, dashboard stats
TTL / Stale time: 5 minutes (staleTime: 5 * 60 * 1000)
Invalidation: Manual via queryClient.invalidateQueries() after mutations
When a user edits a link, the UI optimistically updates the cache, then re-fetches from purr-api to confirm. This avoids flickering while ensuring consistency.
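The stale-time check and the optimistic edit can be sketched without the library. This is a simplified stand-in, not the React Query API: `ClientCache`, `editLinkOptimistically`, and the synchronous `save` callback are hypothetical names for illustration. In React Query itself the same roles are usually played by the `staleTime` option, the `onMutate` and `onError` mutation callbacks, and `queryClient.invalidateQueries()`.

```typescript
// Simplified stand-in for a client-side query cache (not the React Query API).
interface DashboardLink { slug: string; url: string }
interface CacheEntry<T> { data: T; fetchedAt: number }

const STALE_TIME_MS = 5 * 60 * 1000; // staleTime from the text above

class ClientCache<T> {
  private entries = new Map<string, CacheEntry<T>>();

  set(key: string, data: T, now: number): void {
    this.entries.set(key, { data, fetchedAt: now });
  }

  get(key: string): T | undefined {
    return this.entries.get(key)?.data;
  }

  delete(key: string): void {
    this.entries.delete(key);
  }

  // An entry is stale once staleTime has elapsed, or if it was never fetched.
  isStale(key: string, now: number): boolean {
    const entry = this.entries.get(key);
    return entry === undefined || now - entry.fetchedAt >= STALE_TIME_MS;
  }

  // Marks an entry stale immediately so the next read re-fetches it.
  invalidate(key: string): void {
    const entry = this.entries.get(key);
    if (entry) entry.fetchedAt = Number.NEGATIVE_INFINITY;
  }
}

// Writes the edit to the cache first, then calls the server.
// Rolls the cache back to the previous value if the server call throws.
function editLinkOptimistically(
  cache: ClientCache<DashboardLink>,
  key: string,
  next: DashboardLink,
  save: (link: DashboardLink) => void, // hypothetical API call; throws on failure
  now: number,
): boolean {
  const previous = cache.get(key);
  cache.set(key, next, now); // optimistic write: the UI shows the new value at once
  try {
    save(next);
    cache.invalidate(key); // force a confirming re-fetch on the next read
    return true;
  } catch {
    if (previous === undefined) cache.delete(key);
    else cache.set(key, previous, now); // roll back to the last known value
    return false;
  }
}
```

The real API call is asynchronous; it is kept synchronous here only to keep the rollback logic easy to follow.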
Layer 2: Cloudflare KV (Edge)
Where: Cloudflare’s global edge network (250+ cities)
What: Link resolution data (url, isActive, expiresAt, utm_*, redirectMode, protectionType)
Key format: link:{slug}
TTL: 300 seconds (5 minutes) for entries written by zoomies-edge on cache miss
Consistency: Eventually consistent (writes propagate globally within ~60 seconds)
Two sources of KV data:
- zoomies-edge write-back: on a cache miss, the worker fetches from purr-api and writes to KV with a 5-minute TTL
- KV Sync Service: purr-api actively pushes link changes to KV via REST API (see smo1-015)
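The write-back path can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `FakeKV` is an in-memory, synchronous stand-in for a KV namespace, `resolveAtEdge` and `OriginFetch` are hypothetical names, and `expiresAt` is assumed to be an epoch timestamp or null. In a real Worker the KV calls are asynchronous, and the TTL is normally passed through the `expirationTtl` option of `put`.

```typescript
interface LinkRecord { url: string; isActive: boolean; expiresAt: number | null }

const KV_TTL_SECONDS = 300; // 5-minute TTL for write-back entries

// In-memory stand-in for a KV namespace with per-key expiry.
class FakeKV {
  private store = new Map<string, { value: string; expiresAtMs: number }>();

  get(key: string, nowMs: number): string | null {
    const hit = this.store.get(key);
    if (!hit || nowMs >= hit.expiresAtMs) return null; // missing or expired
    return hit.value;
  }

  put(key: string, value: string, ttlSeconds: number, nowMs: number): void {
    this.store.set(key, { value, expiresAtMs: nowMs + ttlSeconds * 1000 });
  }
}

// Hypothetical call from the worker to purr-api; returns null for unknown slugs.
type OriginFetch = (slug: string) => LinkRecord | null;

// Read path at the edge: serve from KV on a hit, otherwise fetch from the
// origin and write the result back to KV with the 5-minute TTL.
function resolveAtEdge(
  kv: FakeKV,
  fetchFromOrigin: OriginFetch,
  slug: string,
  nowMs: number,
): { record: LinkRecord | null; source: "kv" | "origin" } {
  const key = `link:${slug}`;
  const cached = kv.get(key, nowMs);
  if (cached !== null) {
    return { record: JSON.parse(cached) as LinkRecord, source: "kv" };
  }
  const record = fetchFromOrigin(slug);
  if (record !== null) {
    kv.put(key, JSON.stringify(record), KV_TTL_SECONDS, nowMs);
  }
  return { record, source: "origin" };
}
```

Unknown slugs are not written back in this sketch, so every lookup for a missing link reaches the origin; whether the real worker caches negative results is not stated here.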
Layer 3: Redis (Backend)
Where: purr-api server (or Redis cluster in production)
What:
- Link URL cache (same data as KV, but for purr-api internal use)
- Rate limit counters
- JWKS public key cache
Link cache TTL: 300 seconds (5 minutes)
JWKS cache TTL: 300 seconds (5 minutes)
Rate limit TTL: 3600 seconds (1 hour)
Redis acts as a “hot cache” for purr-api, reducing PostgreSQL query load by ~80% for read-heavy endpoints like link resolution and dashboard stats.
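As an illustration of how a rate limit counter with a 1-hour TTL can behave, here is a minimal fixed-window sketch. The windowing algorithm, the `ratelimit:{userId}` key format, and the `allowRequest` function are assumptions, not the documented purr-api implementation. `FakeRedisCounters` is an in-memory stand-in for the Redis `INCR` and `EXPIRE` commands.

```typescript
type Counter = { count: number; expiresAtMs: number | null };

// In-memory stand-in for the two Redis commands used here: INCR and EXPIRE.
class FakeRedisCounters {
  private counters = new Map<string, Counter>();

  // Returns the entry if it exists and has not expired; drops it otherwise.
  private live(key: string, nowMs: number): Counter | undefined {
    const entry = this.counters.get(key);
    if (entry && entry.expiresAtMs !== null && nowMs >= entry.expiresAtMs) {
      this.counters.delete(key);
      return undefined;
    }
    return entry;
  }

  incr(key: string, nowMs: number): number {
    const entry = this.live(key, nowMs) ?? { count: 0, expiresAtMs: null };
    entry.count += 1;
    this.counters.set(key, entry);
    return entry.count;
  }

  expire(key: string, ttlSeconds: number, nowMs: number): void {
    const entry = this.live(key, nowMs);
    if (entry) entry.expiresAtMs = nowMs + ttlSeconds * 1000;
  }
}

const RATE_LIMIT_TTL_SECONDS = 3600; // rate limit TTL from the list above

// Fixed-window check: the first request in a window creates the counter and
// starts the 1-hour expiry; later requests only increment it.
function allowRequest(
  redis: FakeRedisCounters,
  userId: string,
  limit: number, // hypothetical per-window limit
  nowMs: number,
): boolean {
  const key = `ratelimit:${userId}`; // hypothetical key format
  const count = redis.incr(key, nowMs);
  if (count === 1) redis.expire(key, RATE_LIMIT_TTL_SECONDS, nowMs);
  return count <= limit;
}
```

Because the counter expires on its own, no cleanup job is needed; the trade-off of a fixed window is that a burst straddling the window boundary can briefly exceed the limit.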
Layer 4: PostgreSQL (Source of Truth)
Where: Primary database (Railway in production, Docker locally)
What: All persistent data
Cache behaviour: No implicit caching; every query hits disk unless served by PostgreSQL’s own buffer pool
Cache Invalidation Strategy
When a link is updated (e.g., destination URL changed, protection added):
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#e8f4f8', 'primaryTextColor': '#2d3748', 'primaryBorderColor': '#90cdf4', 'lineColor': '#718096', 'secondaryColor': '#f0fff4', 'tertiaryColor': '#fefcbf'}}}%%
flowchart LR
    A[Link updated<br/>via API] --> B[LinkService.Update]
    B --> C[UPDATE PostgreSQL]
    B --> D["DELETE Redis link:{slug}"]
    B --> E["PUT Cloudflare KV link:{slug}"]
    B --> F[Invalidate React Query<br/>via websocket / polling]
    C --> G[Source of truth updated]
    D --> H[Next read re-fills from PG]
    E --> I[Edge sees new data<br/>within ~60s]
    F --> J[Dashboard refreshes]

Key principle: Write-through to KV + Redis deletion. The next read will:
- Miss in Redis → query PostgreSQL → re-fill Redis
- Find updated data in KV (if sync succeeded) or fall back to API
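The update and the read that follows it can be sketched with three in-memory maps standing in for PostgreSQL, Redis, and KV. The names `LinkStores`, `updateLink`, and `readLinkFromBackend` are hypothetical, and TTLs, KV propagation delay, and error handling are deliberately left out.

```typescript
interface LinkRow { slug: string; url: string }

// In-memory stand-ins for the three stores touched by an update.
class LinkStores {
  postgres = new Map<string, LinkRow>(); // source of truth, keyed by slug
  redis = new Map<string, string>();     // backend cache, keyed by link:{slug}
  kv = new Map<string, string>();        // edge cache, keyed by link:{slug}
}

// Update path: write the source of truth, delete the Redis entry,
// and push the new value to KV.
function updateLink(stores: LinkStores, row: LinkRow): void {
  const key = `link:${row.slug}`;
  stores.postgres.set(row.slug, row);      // UPDATE PostgreSQL
  stores.redis.delete(key);                // DELETE Redis link:{slug}
  stores.kv.set(key, JSON.stringify(row)); // PUT Cloudflare KV link:{slug}
}

// Backend read path: a Redis miss falls through to PostgreSQL and re-fills Redis.
function readLinkFromBackend(stores: LinkStores, slug: string): LinkRow | null {
  const key = `link:${slug}`;
  const cached = stores.redis.get(key);
  if (cached !== undefined) return JSON.parse(cached) as LinkRow;

  const row = stores.postgres.get(slug) ?? null;
  if (row !== null) stores.redis.set(key, JSON.stringify(row));
  return row;
}
```

Deleting the Redis entry rather than overwriting it means a failed or reordered cache write cannot leave Redis holding a value that PostgreSQL never committed.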
Stale Data Scenarios
| Scenario | Impact | Mitigation |
|---|---|---|
| KV propagation delay (up to 60s) | Edge may serve old URL briefly | Acceptable for most use cases; critical updates can switch to a new custom slug instead |
| Redis expiry race condition | Two requests both miss and query PG | No data loss; minor PG load spike |
| React Query stale data | User sees old link list for up to 5 min | Manual invalidation on mutation; optimistic updates |
Key Terms
- TTL → Time-To-Live; seconds until a cache entry expires automatically
- Write-through → Writing to cache and database simultaneously on update
- Cache invalidation → Deleting or updating cached entries when source data changes
- Eventually consistent → KV property: reads may return slightly old data after a write
- Optimistic update → UI assumes the mutation succeeds and updates cache immediately, rolling back on error
Q&A
Q: Why not use a single cache layer?
A: Each layer serves a different purpose. React Query reduces API calls from the browser. KV reduces origin latency globally. Redis reduces database load. PostgreSQL ensures durability. Removing any layer would create a bottleneck.
Q: What is the maximum staleness a user can experience?
A: Worst case: KV propagation delay (~60s) + React Query stale time (5 min) = ~6 minutes. In practice, link updates trigger immediate React Query invalidation, so dashboard users see changes within seconds.
Q: How does the system handle a link deletion?
A: The link is soft-deleted (is_active = false). KV and Redis entries are deleted. The edge worker treats missing or inactive links as “not found” and proxies to the landing page.
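A minimal sketch of the edge decision for missing or inactive links follows. The `EdgeLink` shape uses field names from the KV layer description; the `decide` function, the `EdgeDecision` type, the epoch-millisecond format of `expiresAt`, and the treatment of expired links as "not found" are assumptions.

```typescript
interface EdgeLink { url: string; isActive: boolean; expiresAt: number | null }

type EdgeDecision =
  | { kind: "redirect"; location: string }
  | { kind: "not_found" }; // the worker would proxy to the landing page

// Decides what the edge worker does with the result of a link lookup.
function decide(link: EdgeLink | null, nowMs: number): EdgeDecision {
  if (link === null) return { kind: "not_found" };  // entry deleted or never existed
  if (!link.isActive) return { kind: "not_found" }; // soft-deleted link
  // Assumption: an expired link is handled the same way as an inactive one.
  if (link.expiresAt !== null && nowMs >= link.expiresAt) {
    return { kind: "not_found" };
  }
  return { kind: "redirect", location: link.url };
}
```

Checking `isActive` in addition to a missing entry covers the window in which a soft-deleted link is still present in a cache that has not yet been purged.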
Q: Why delete the Redis entry but PUT to KV on update?
A: Redis is fast to re-fill (local to purr-api). KV is slow to propagate, so we actively push the new value rather than waiting for the edge worker to discover the deletion and re-fetch.
Examples
Think of caching like a city’s water supply:
- React Query is the water tank on your roof — instant pressure, but only holds a small amount and needs refilling
- Cloudflare KV is the neighbourhood water tower — shared by many houses, refilled from the main plant, and takes a few minutes to update when the city switches reservoirs
- Redis is the pumping station — pressurises water for the neighbourhood and reduces load on the main pipes
- PostgreSQL is the reservoir and treatment plant — the ultimate source, but too far away to rely on for every glass of water
- Cache invalidation is the city switching water sources: they update the treatment plant, flush the pumping station, and fill the towers with new water — but the roof tank on your house might still have old water until you drain it
Neighbors on the Map
- Edge Redirect Flow: debugging why a slug does not redirect
- Click Tracking Pipeline: debugging missing or duplicate click counts
- Database Architecture: designing a query that spans transactional and analytical data
- Tier-Based Rate Limiting: debugging 429 errors for specific users