CRUMB a card from devarno-cloud

Timeline Analysis Saga

skyflow intermediate 7 min read

ELI5

When a user hits POST /api/v1/timeline/analyze, Core API doesn’t do the work itself. It publishes events.timeline.requested to NATS and immediately replies 202 Accepted. The Python Timeline Service consumes that event, fetches Bluesky posts, runs the algorithm, writes results, then publishes events.timeline.completed — which Gamification picks up to award XP and Realtime Gateway picks up to push a notification.

Technical Deep Dive

End-to-End Sequence

sequenceDiagram
autonumber
participant C as Client
participant API as core-api
participant N as NATS (TIMELINE)
participant TL as timeline-service (Py)
participant BS as Bluesky / AT Proto
participant PG as Postgres
participant RD as Redis
participant G as NATS (GAMIFICATION)
participant GAM as gamification-service
participant RT as realtime-gateway
C->>API: POST /api/v1/timeline/analyze {did, type}
API->>API: tier check + rate limit (Redis)
API->>N: publish events.timeline.requested
API-->>C: 202 Accepted {analysis_id, estimated_completion}
N->>TL: deliver (work_queue, max_deliver=3)
TL->>RD: GET analysis:{did}:{type}:{date}
alt cache hit
RD-->>TL: cached TimelineAnalysis
else cache miss
TL->>BS: get_author_feed (token-bucket rate-limited)
BS-->>TL: posts batch
TL->>TL: calculate_flow / clout / kudos / pulse
TL->>PG: INSERT timeline_analyses
TL->>RD: SET analysis:... TTL=1h (DRIFT) / 30m (LIFT+)
end
TL->>N: publish events.timeline.completed
TL->>N: ACK requested msg
N->>GAM: deliver completed (consumer group)
GAM->>PG: INSERT xp_transactions (+10 × multiplier)
GAM->>G: publish events.xp.earned
GAM->>G: publish events.realtime.xp_earned
G->>RT: deliver realtime
RT->>C: SSE event: xp_earned

State of an Analysis Row

flowchart LR
Q["requested<br/>(NATS only)"] --> P["processing<br/>in-flight"]
P --> C["completed<br/>row in timeline_analyses"]
P --> F["failed<br/>events.timeline.failed"]
C --> X["XP awarded<br/>xp_transactions row"]

Tier-Aware Cache TTL (services/02-timeline-service.md § 4.1)

TierCache TTL
DRIFT60 minutes
LIFT / JET / ORBIT30 minutes

The shorter TTL on paid tiers ensures fresher data for paying users, at the cost of more Bluesky API calls (rate-limited via internal token bucket of 5000 req/h).

Failure Modes

Where it breaksSymptomRecovery
Bluesky 5xxevents.timeline.failed, NACK on requested msgNATS redelivers up to 3× then DLQ
Postgres write failsanalysis lost; requested msg NACK’dredelivery; idempotent re-analysis
Gamification downXP not awarded; completed msg accumulates in consumer pendingdrain when service returns; max_ack_pending=100 backpressure

Key Terms

  • Saga → multi-step async workflow coordinated by event hand-off rather than RPC orchestration
  • work_queue stream → TIMELINE deletes a message once any consumer group acks it; the analysis runs once
  • Token bucket → in-process rate limiter for outbound Bluesky API calls (5000 req/h)
  • estimated_completion → returned in the 202 response; client may poll GET /api/v1/timeline/analyses/:id or wait for SSE

Q&A

Q: Why does the API return 202 instead of 200 with the result? A: Analysis fetches up to 500 posts from Bluesky (multi-second). Returning 202 lets the client subscribe to SSE or poll for the completed row, keeping HTTP latency under the 200ms p95 budget for core-api.

Q: What guarantees XP is awarded exactly once? A: Two layers. (1) Timeline publishes events.timeline.completed with event_id matching the analysis_id; Gamification dedupes on event_id. (2) xp_transactions is append-only — even a duplicate would be detectable by querying WHERE event_id = $1.

Q: What happens to the cache hit path? A: Even on cache hit, Timeline still publishes events.timeline.completed so Gamification awards XP and the user sees their reward — the cache only short-circuits the Bluesky fetch + algorithm.

Q: How is the SSE notification routed back to the right user? A: events.realtime.xp_earned carries user_id. Realtime Gateway’s connection manager looks up open SSE channels for that user and writes the event: xp_earned frame.

Examples

Like ordering a custom-printed t-shirt online: the website (core-api) confirms the order instantly (202), the print shop (timeline-service) does the slow work, then a tracking notification (SSE xp_earned) hits your phone the moment the parcel ships.

neighbors on the map