NATS JetStream Stream Topology
skyflow intermediate 6 min read
ELI5
NATS JetStream is the conveyor belt that carries every Skyflow event. There are six belts (streams), each with its own size and how-long-to-keep policy: USER and GAMIFICATION are big and durable, TIMELINE is a work queue (delete after consumed), REALTIME lives only in RAM for 5 minutes. Subjects are addresses on the belt, like events.user.registered.
Technical Deep Dive
Streams (nats/topology.md)
| Stream | Subjects | Storage | Retention | Max age | Notes |
|---|---|---|---|---|---|
| USER | events.user.> | file | limits | 365d | 1M msgs / 10GB |
| SUBSCRIPTION | events.subscription.> | file | limits | 730d | 2y for compliance |
| TIMELINE | events.timeline.> | file | work_queue | 7d | delete-after-consume; 5MB max msg |
| GAMIFICATION | events.gamification.> events.xp.> events.achievement.> events.streak.> | file | limits | 365d | 5M msgs / 50GB |
| REALTIME | events.realtime.> | memory | limits | 5m | ephemeral push |
| DLQ | dlq.> | file | limits | 30d | failed events for inspection |
duplicate_window per stream: USER 2m, SUBSCRIPTION 5m, TIMELINE 1m, GAMIFICATION 30s, REALTIME 10s. NATS dedupes on the Nats-Msg-Id header within this window.
Stream → Consumer Wiring
flowchart LR subgraph Producers CA[core-api] TL[timeline-service] GAM[gamification-service] end subgraph Streams U[USER] S[SUBSCRIPTION] T[TIMELINE\nwork_queue] G[GAMIFICATION] R[REALTIME\nmemory] D[DLQ] end subgraph Consumers gamU[user-indexer] caS[subscription-processor] tlT[timeline-processor] gamT[timeline-xp-awarder] lbG[leaderboard-updater] acG[achievement-checker] rtR[realtime-gateway] er[event-router] end CA --> U CA --> S CA --> T TL --> T TL --> G GAM --> G GAM --> R U --> gamU S --> caS T --> tlT T --> gamT G --> lbG G --> acG R --> rtR U --> er S --> er T --> er G --> er er --> DConsumer Group Defaults
classDiagram class ConsumerConfig { +string durable_name +string filter_subject +AckPolicy explicit +int max_deliver = 3 +duration ack_wait = 60s +int max_ack_pending = 100 }A durable consumer with the same durable_name across N replicas forms a load-balanced group; events fan out one-per-group, redelivered up to 3 times before the event-router publishes to dlq.<subject>.
Key Terms
- Stream → durable, named append log on JetStream (file or memory backed)
- Subject → topic address inside a stream (
events.user.registered) - work_queue retention → message is deleted after the first consumer acks; only one logical consumer group can exist per subject
- duplicate_window → time during which JetStream rejects re-publishes with the same
Nats-Msg-Id - max_ack_pending → backpressure cap; consumer pauses fetch when this many unacked msgs are outstanding
Q&A
Q: Why does TIMELINE use work_queue retention?
A: Each events.timeline.requested should be processed exactly once by the analysis worker; deleting after ack prevents accidental re-analysis and keeps the stream small despite 5MB messages.
Q: Can multiple consumer groups read from a work_queue stream?
A: No — work_queue allows only one logical consumer per subject. If both Gamification and a future analytics service need timeline events, route through GAMIFICATION (which is limits retention) or split the stream.
Q: What happens when REALTIME hits max_age=5m?
A: Messages older than 5 minutes are dropped. Acceptable because realtime notifications are useless if a client wasn’t connected to receive them in time.
Q: Where do failed events go after max_deliver=3?
A: Event Router catches the terminal failure and republishes onto dlq.<original_subject>, also persisting the protobuf bytes to the dlq_events Postgres table for manual replay.
Examples
Think of streams as parcel-sorting belts at a depot. USER and SUBSCRIPTION are long-haul archive belts (years of records). TIMELINE is the same-day courier belt — once a parcel is picked, it’s gone. REALTIME is a pneumatic tube — five minutes and the message vanishes whether anyone caught it or not.
neighbors on the map
- NATS Subject Taxonomy wiring a new consumer to the right stream
- CI Transition Event Schema vendoring kahn_emit.py into a CI producer
- Agent Transition Event Schema vendoring kahn_agent_emit.py into an agent loop
- NATS Event Bridge subscribing a Choco service to STRATT lifecycle events