CRUMB a card from devarno-cloud

Multi-Layer Caching Strategy

smo1 intermediate 6 min read

ELI5

SMO1 keeps link data in four places, each at a different distance from the user. It is like a photo of your ID: you carry one in your wallet (browser cache), the building security desk has a copy (edge KV), the HR office keeps another copy on file (Redis), and the government database holds the master record (PostgreSQL). When you change your hairstyle, every copy needs updating, but the wallet photo is always the fastest to check.

Technical Deep Dive

Four-Layer Cache Hierarchy

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#e8f4f8', 'primaryTextColor': '#2d3748', 'primaryBorderColor': '#90cdf4', 'lineColor': '#718096', 'secondaryColor': '#f0fff4', 'tertiaryColor': '#fefcbf'}}}%%
flowchart TB
subgraph User["User's Browser"]
RC[React Query<br/>Client Cache]
end
subgraph Edge["Cloudflare Edge"]
KV["Cloudflare KV<br/>link:{slug}"]
end
subgraph Backend["purr-api"]
RD[Redis<br/>link URL cache]
JW[JWKS Cache<br/>airlock keys]
end
subgraph Origin["Origin"]
PG[(PostgreSQL)]
end
U[User] -->|Dashboard| RC
RC -->|Stale after 5 min| P1[purr-api]
U -->|Short link| KV
KV -->|Miss| P2[purr-api]
P1 --> RD
P2 --> RD
RD -->|Miss| PG
P1 --> JW
P2 --> JW
JW -->|Miss| Airlock[Airlock JWKS]

Layer 1: React Query (Client-Side)

Where: meow-web browser tab
What: User data, link lists, dashboard stats
TTL / Stale time: 5 minutes (staleTime: 5 * 60 * 1000)
Invalidation: Manual via queryClient.invalidateQueries() after mutations

When a user edits a link, the UI optimistically updates the cache, then re-fetches from purr-api to confirm. This avoids flickering while ensuring consistency.
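The optimistic-update flow described above can be sketched with a plain in-memory map standing in for the client cache. This is a minimal illustration, not the meow-web implementation: `Link`, `QueryCache`, `optimisticUpdate`, `editLink`, and the synchronous `save` callback are all hypothetical names. In real React Query code the same steps would typically live in a mutation's `onMutate`, `onError`, and `onSettled` callbacks, using `queryClient.setQueryData()` and `queryClient.invalidateQueries()`.

```typescript
// Hypothetical shape of a cached link; only the fields the sketch needs.
interface Link {
  slug: string;
  url: string;
}

// Stand-in for the client cache: query key -> cached list of links.
type QueryCache = Map<string, Link[]>;

// Apply an edit to the cached list immediately and return a rollback
// function that restores the previous snapshot.
function optimisticUpdate(cache: QueryCache, key: string, edited: Link): () => void {
  const previous = cache.get(key);
  const next = (previous ?? []).map((l) => (l.slug === edited.slug ? edited : l));
  cache.set(key, next);
  return () => {
    if (previous === undefined) {
      cache.delete(key);
    } else {
      cache.set(key, previous);
    }
  };
}

// Run a mutation: optimistic write first, then confirm or roll back.
// `save` is a hypothetical synchronous stand-in for the call to purr-api.
function editLink(
  cache: QueryCache,
  key: string,
  edited: Link,
  save: (link: Link) => Link[],
): boolean {
  const rollback = optimisticUpdate(cache, key, edited);
  try {
    // On success, replace the optimistic value with the server's answer
    // (the equivalent of invalidating and re-fetching the query).
    cache.set(key, save(edited));
    return true;
  } catch {
    // On failure, restore the snapshot taken before the edit.
    rollback();
    return false;
  }
}
```

Because the cache is written before the server responds, the UI never shows the old value while the request is in flight; the rollback covers the case where the server rejects the edit.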

Layer 2: Cloudflare KV (Edge)

Where: Cloudflare’s global edge network (250+ cities)
What: Link resolution data (url, isActive, expiresAt, utm_*, redirectMode, protectionType)
Key format: link:{slug}
TTL: 300 seconds (5 minutes) for entries written by zoomies-edge on cache miss
Consistency: Eventually consistent (writes propagate globally within ~60 seconds)

Two sources of KV data:

  1. zoomies-edge write-back — on cache miss, the worker fetches from purr-api and writes to KV with 5-minute TTL
  2. KV Sync Service — purr-api actively pushes link changes to KV via REST API (see smo1-015)
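The write-back path in item 1 can be modelled as follows. This is a simplified, synchronous sketch with an in-memory store in place of Cloudflare KV; `TtlStore`, `LinkRecord`, `resolveLink`, and `fetchFromOrigin` are hypothetical names, and the record carries only two of the fields listed above. A real worker would use the asynchronous KV binding, where `put` accepts an `expirationTtl` option in seconds.

```typescript
// Subset of the link resolution data; field types are assumptions.
interface LinkRecord {
  url: string;
  isActive: boolean;
}

// In-memory stand-in for a KV namespace with a per-entry TTL.
class TtlStore<V> {
  private entries = new Map<string, { value: V; expiresAtMs: number }>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAtMs <= this.now()) {
      this.entries.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }

  put(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAtMs: this.now() + ttlSeconds * 1000 });
  }
}

const KV_TTL_SECONDS = 300; // 5 minutes, as stated above

// Resolve a slug: serve from KV on a hit; on a miss, ask the origin
// (purr-api) and write the result back so the next request is a hit.
function resolveLink(
  slug: string,
  kv: TtlStore<LinkRecord>,
  fetchFromOrigin: (slug: string) => LinkRecord | undefined,
): LinkRecord | undefined {
  const key = `link:${slug}`;
  const cached = kv.get(key);
  if (cached) return cached;

  const fresh = fetchFromOrigin(slug);
  if (fresh) kv.put(key, fresh, KV_TTL_SECONDS);
  return fresh;
}
```

The sketch also shows why the second source matters: an entry written on a miss keeps serving its cached value until the TTL runs out unless something overwrites it, which is what the KV Sync Service push does.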

Layer 3: Redis (Backend)

Where: purr-api server (or Redis cluster in production)
What:

  • Link URL cache (same data as KV, but for purr-api internal use)
  • Rate limit counters
  • JWKS public key cache

Link cache TTL: 300 seconds (5 minutes)
JWKS cache TTL: 300 seconds (5 minutes)
Rate limit TTL: 3600 seconds (1 hour)

Redis acts as a “hot cache” for purr-api, reducing PostgreSQL query load by ~80% for read-heavy endpoints like link resolution and dashboard stats.
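Of the three Redis uses listed above, the rate limit counter is the only one with a one-hour TTL. One common way to build such a counter is a fixed window: increment a key per client and let it expire when the window ends. The sketch below assumes that approach; the source does not say which algorithm purr-api uses. `CounterStore`, `allowRequest`, the `ratelimit:` key prefix, and the limit of 100 are all hypothetical, and the in-memory map stands in for Redis `INCR` plus `EXPIRE`.

```typescript
const RATE_LIMIT_TTL_SECONDS = 3600; // 1 hour, as stated above
const HYPOTHETICAL_LIMIT = 100; // assumed; the source does not give a limit

// In-memory stand-in for a Redis counter key with an expiry.
class CounterStore {
  private counters = new Map<string, { count: number; expiresAtMs: number }>();
  private now: () => number;

  // The clock is injectable so window expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Increment the counter, starting a new window (and TTL) when the key
  // is absent or expired. Returns the count within the current window.
  incr(key: string, ttlSeconds: number): number {
    const nowMs = this.now();
    const entry = this.counters.get(key);
    if (!entry || entry.expiresAtMs <= nowMs) {
      this.counters.set(key, { count: 1, expiresAtMs: nowMs + ttlSeconds * 1000 });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }
}

// Allow the request while the client is within the limit for this window.
function allowRequest(
  store: CounterStore,
  clientId: string,
  limit: number = HYPOTHETICAL_LIMIT,
): boolean {
  const count = store.incr(`ratelimit:${clientId}`, RATE_LIMIT_TTL_SECONDS);
  return count <= limit;
}
```

With this design the TTL doubles as the window length, so no separate cleanup job is needed: an idle client's counter simply disappears after an hour.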

Layer 4: PostgreSQL (Source of Truth)

Where: Primary database (Railway in production, Docker locally)
What: All persistent data
Cache behaviour: No implicit caching; every query hits disk unless served by PostgreSQL’s own buffer pool

Cache Invalidation Strategy

When a link is updated (e.g., destination URL changed, protection added):

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#e8f4f8', 'primaryTextColor': '#2d3748', 'primaryBorderColor': '#90cdf4', 'lineColor': '#718096', 'secondaryColor': '#f0fff4', 'tertiaryColor': '#fefcbf'}}}%%
flowchart LR
A[Link updated<br/>via API] --> B[LinkService.Update]
B --> C[UPDATE PostgreSQL]
B --> D["DELETE Redis link:{slug}"]
B --> E["PUT Cloudflare KV link:{slug}"]
B --> F[Invalidate React Query<br/>via websocket / polling]
C --> G[Source of truth updated]
D --> H[Next read re-fills from PG]
E --> I[Edge sees new data<br/>within ~60s]
F --> J[Dashboard refreshes]

Key principle: Write-through to KV + Redis deletion. The next read will:

  1. Miss in Redis → query PostgreSQL → re-fill Redis
  2. Find updated data in KV (if sync succeeded) or fall back to API
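The update fan-out and the read that follows it can be sketched with three in-memory maps standing in for PostgreSQL, Redis, and KV. This is an illustration under simplifying assumptions, not the actual LinkService code: `Stores`, `updateLink`, and `readFromBackend` are hypothetical names, TTLs are omitted, and the KV push is treated as always succeeding and instantly visible, whereas the real push can fail and takes time to propagate.

```typescript
// Subset of the link data; field types are assumptions.
interface LinkRecord {
  url: string;
  isActive: boolean;
}

// In-memory stand-ins for the three stores touched by an update.
interface Stores {
  postgres: Map<string, LinkRecord>; // source of truth, keyed by slug
  redis: Map<string, LinkRecord>; // keyed by link:{slug}
  kv: Map<string, LinkRecord>; // keyed by link:{slug}
}

// Update path: write the source of truth, delete the Redis entry,
// and push the new value to KV.
function updateLink(stores: Stores, slug: string, record: LinkRecord): void {
  const key = `link:${slug}`;
  stores.postgres.set(slug, record);
  stores.redis.delete(key);
  stores.kv.set(key, record);
}

// Backend read path: try Redis first, fall back to PostgreSQL,
// and re-fill Redis on a miss.
function readFromBackend(stores: Stores, slug: string): LinkRecord | undefined {
  const key = `link:${slug}`;
  const cached = stores.redis.get(key);
  if (cached) return cached;

  const row = stores.postgres.get(slug);
  if (row) stores.redis.set(key, row);
  return row;
}
```

After `updateLink` runs, Redis holds no entry for the slug, so the first `readFromBackend` call goes to PostgreSQL and re-fills Redis with the new value, while KV already holds the pushed record.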

Stale Data Scenarios

| Scenario | Impact | Mitigation |
|---|---|---|
| KV propagation delay (up to 60s) | Edge may serve old URL briefly | Acceptable for most use cases; critical updates use custom slug change |
| Redis expiry race condition | Two requests both miss and query PG | No data loss; minor PG load spike |
| React Query stale data | User sees old link list for up to 5 min | Manual invalidation on mutation; optimistic updates |

Key Terms

  • TTL → Time-To-Live; seconds until a cache entry expires automatically
  • Write-through → Writing to cache and database simultaneously on update
  • Cache invalidation → Deleting or updating cached entries when source data changes
  • Eventually consistent → KV property: reads may return slightly old data after a write
  • Optimistic update → UI assumes the mutation succeeds and updates cache immediately, rolling back on error

Q&A

Q: Why not use a single cache layer?
A: Each layer serves a different purpose. React Query reduces API calls from the browser. KV reduces origin latency globally. Redis reduces database load. PostgreSQL ensures durability. Removing any layer would create a bottleneck.

Q: What is the maximum staleness a user can experience?
A: Worst case: KV propagation delay (~60s) + React Query stale time (5 min) = ~6 minutes. In practice, link updates trigger immediate React Query invalidation, so dashboard users see changes within seconds.

Q: How does the system handle a link deletion?
A: The link is soft-deleted (is_active = false). KV and Redis entries are deleted. The edge worker treats missing or inactive links as “not found” and proxies to the landing page.

Q: Why delete Redis but put KV on update?
A: Redis is fast to re-fill (local to purr-api). KV is slow to propagate, so we actively push the new value rather than waiting for the edge worker to discover the deletion and re-fetch.

Examples

Think of caching like a city’s water supply:

  • React Query is the water tank on your roof — instant pressure, but only holds a small amount and needs refilling
  • Cloudflare KV is the neighbourhood water tower — shared by many houses, refilled from the main plant, and takes a few minutes to update when the city switches reservoirs
  • Redis is the pumping station — pressurises water for the neighbourhood and reduces load on the main pipes
  • PostgreSQL is the reservoir and treatment plant — the ultimate source, but too far away to rely on for every glass of water
  • Cache invalidation is the city switching water sources: they update the treatment plant, flush the pumping station, and fill the towers with new water — but the roof tank on your house might still have old water until you drain it
