CRUMB a card from devarno-cloud

SSE Job Event Stream & Cursor Backfill

so1 advanced 6 min read

ELI5

The server is a sportscaster commentating live. If you walk out for a snack, when you come back you say “I last heard line 42” and the caster fast-forwards through 43-50 before resuming live play-by-play.

Technical Deep Dive

Endpoint

GET /api/jobs/{jobId}/events opens a persistent SSE connection (HTTP/1.1, text/event-stream). Per ADR-003, SSE was chosen over WebSocket because the channel is server-to-client only and SSE clients (the browser EventSource API) already reconnect automatically over plain HTTP.

Event Shapes

event: status
data: {"state":"running"}
event: log
data: {"line":"Processing...","cursor":42}
event: status
data: {"state":"success","output":{...}}

Two event names: status (state transitions per so1-006) and log (append-only lines from Job.logs). On the wire, a blank line terminates each event.
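The two event shapes can be modelled as a small discriminated union. The following is a minimal sketch, not the project's actual code: the names StatusPayload, LogPayload, JobEvent and parseJobEvents are hypothetical, and only the field names (state, output, line, cursor) come from the shapes above.

```typescript
// Hypothetical type and function names; field names follow the event shapes above.
interface StatusPayload {
  state: string; // e.g. "running" or "success"; the full set is defined in so1-006
  output?: unknown;
}

interface LogPayload {
  line: string;
  cursor: number;
}

type JobEvent =
  | { event: "status"; data: StatusPayload }
  | { event: "log"; data: LogPayload };

// Parse a complete text/event-stream buffer into typed job events.
function parseJobEvents(raw: string): JobEvent[] {
  const events: JobEvent[] = [];
  // On the wire, a blank line terminates each SSE event.
  for (const frame of raw.split(/\r?\n\r?\n/)) {
    let eventName = "message"; // SSE default when no "event:" field is sent
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      const field = /^(event|data): ?(.*)$/.exec(line);
      if (!field) continue; // comments (": ping") and other fields are skipped
      if (field[1] === "event") eventName = field[2];
      else dataLines.push(field[2]);
    }
    if (dataLines.length === 0) continue; // keep-alive or empty frame
    if (eventName !== "status" && eventName !== "log") continue;
    const data = JSON.parse(dataLines.join("\n"));
    events.push({ event: eventName, data } as JobEvent);
  }
  return events;
}
```

In a browser the EventSource API does this framing itself; a hand-written parser like this is mainly useful in tests or in non-browser clients that read the response body directly.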

Reconnect with Cursor

sequenceDiagram
participant C as Client
participant B as BFF
participant S as Log store
C->>B: GET /api/jobs/{id}/events
B-->>C: event:log line=1 cursor=1
B-->>C: event:log line=42 cursor=42
Note over C: network drop
C->>B: GET /api/jobs/{id}/events?cursor=42
B->>S: read lines 43..end
S-->>B: lines 43..50
B-->>C: backfill 43..50
B-->>C: live stream resumes
B-->>C: event:status state=success

The client tracks the last received cursor. On reconnect it appends ?cursor=N; the server backfills N+1..now from the persisted log store and then resumes the live tail.
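On the client, cursor tracking amounts to remembering the highest cursor seen and rebuilding the URL on each reconnect. The sketch below assumes cursors start at 1 and that the caller decides when to reconnect; JobLogFollower, LogSource and openSource are hypothetical names. A native EventSource retries against the URL it was created with, so sending an updated ?cursor=N means closing it and opening a new one, which is what connect() does here.

```typescript
// Minimal surface of the browser EventSource that this sketch relies on.
interface LogSource {
  addEventListener(type: string, listener: (e: { data: string }) => void): void;
  close(): void;
}

// Hypothetical helper: tracks the last cursor and re-opens the stream with ?cursor=N.
class JobLogFollower {
  private lastCursor = 0; // assumption: cursors start at 1, so 0 means "nothing seen"
  private source: LogSource | null = null;
  private jobId: string;
  private openSource: (url: string) => LogSource;
  private onLine: (line: string) => void;

  constructor(
    jobId: string,
    openSource: (url: string) => LogSource,
    onLine: (line: string) => void,
  ) {
    this.jobId = jobId;
    this.openSource = openSource;
    this.onLine = onLine;
  }

  // Build the stream URL, adding ?cursor=N once at least one line was received.
  streamUrl(): string {
    const base = `/api/jobs/${encodeURIComponent(this.jobId)}/events`;
    return this.lastCursor > 0 ? `${base}?cursor=${this.lastCursor}` : base;
  }

  // Open the stream, or re-open it after a network drop.
  connect(): void {
    if (this.source) this.source.close();
    const source = this.openSource(this.streamUrl());
    source.addEventListener("log", (e) => {
      const entry = JSON.parse(e.data) as { line: string; cursor: number };
      if (entry.cursor <= this.lastCursor) return; // ignore lines already shown
      this.lastCursor = entry.cursor;
      this.onLine(entry.line);
    });
    this.source = source;
  }
}
```

In a browser, openSource could be (url) => new EventSource(url); injecting it keeps the cursor logic testable without a network.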

Termination

After a terminal status event (success / failed / cancelled per so1-006), the server closes the stream. Clients that connect after the job has already finished receive a single status event carrying the final state, and the stream then closes.
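One client-side detail follows from this: a browser EventSource treats a server-closed connection as a dropped one and tries to reconnect, so the client should close the stream itself once it sees a terminal state. A minimal sketch, assuming the three terminal states named above are the complete set (so1-006 is authoritative); isTerminalState and handleStatusEvent are hypothetical helpers.

```typescript
// Terminal states named in the text; so1-006 holds the authoritative list.
const TERMINAL_STATES = ["success", "failed", "cancelled"];

function isTerminalState(state: string): boolean {
  return TERMINAL_STATES.indexOf(state) !== -1;
}

// Process one status payload and close the stream once the job is finished.
// Returns true if the stream was closed.
function handleStatusEvent(
  rawData: string,
  closeStream: () => void,
  onState: (state: string) => void,
): boolean {
  const payload = JSON.parse(rawData) as { state: string };
  onState(payload.state);
  if (!isTerminalState(payload.state)) return false;
  closeStream(); // stops the client from reconnecting after the server closes
  return true;
}
```

With an EventSource named source, this might be wired up as source.addEventListener("status", (e) => handleStatusEvent(e.data, () => source.close(), render)), where render is whatever updates the UI.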

Trade-offs (ADR-003)

  • One persistent connection per browser tab — costs scale with viewers.
  • Logs must be persisted (DB or S3) to enable cursor backfill; the store is deferred per ADR-003 Phase 2.
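The persistence requirement can be made concrete with a small store interface. This is a sketch under stated assumptions: LogStore, InMemoryLogStore, formatLogEvent and backfillFrames are hypothetical names, cursors are taken to be 1-based line numbers, and the in-memory class only stands in for whatever backend (DB or S3) Phase 2 selects.

```typescript
interface LogLine {
  cursor: number;
  line: string;
}

// Hypothetical store interface; ADR-003 leaves the real backend (DB or S3) open.
interface LogStore {
  append(jobId: string, line: string): LogLine;
  readAfter(jobId: string, cursor: number): LogLine[];
}

// In-memory stand-in, useful only for tests and local development.
class InMemoryLogStore implements LogStore {
  private logs: { [jobId: string]: LogLine[] } = {};

  append(jobId: string, line: string): LogLine {
    const lines = this.logs[jobId] || (this.logs[jobId] = []);
    const entry = { cursor: lines.length + 1, line }; // 1-based line numbers
    lines.push(entry);
    return entry;
  }

  readAfter(jobId: string, cursor: number): LogLine[] {
    return (this.logs[jobId] || []).filter((entry) => entry.cursor > cursor);
  }
}

// Serialize one log line as an SSE "log" event; the blank line ends the event.
function formatLogEvent(entry: LogLine): string {
  const data = JSON.stringify({ line: entry.line, cursor: entry.cursor });
  return `event: log\ndata: ${data}\n\n`;
}

// Build the backfill for ?cursor=N: every persisted line after N, in order.
function backfillFrames(store: LogStore, jobId: string, cursor: number): string {
  return store
    .readAfter(jobId, cursor)
    .map((entry) => formatLogEvent(entry))
    .join("");
}
```

A handler would write backfillFrames(...) to the response first and then attach the live tail, taking care that lines appended between the read and the attach are neither dropped nor sent twice.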

Key Terms

  • SSE → Server-Sent Events: one-way text/event-stream over HTTP/1.1.
  • Cursor → monotonically increasing line number or timestamp identifying a position in the log.
  • Backfill → replaying missed entries after a reconnect before going live.

Q&A

Q: What happens if the client requests ?cursor=42 but the log store has been truncated? A: ADR-003 lists this under “log retention, deferred”, so the behavior is implementation-defined: an out-of-range cursor should either fail with NOT_FOUND or backfill from the oldest available line.
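Both options from the answer above can be expressed as a single policy switch. This is an illustrative sketch only, since ADR-003 does not fix the behavior; CursorPolicy, CursorResolution and resolveCursor are hypothetical names, and oldestRetained is assumed to be the cursor of the oldest line still in the store.

```typescript
type CursorPolicy = "error" | "oldest-available";

type CursorResolution =
  | { ok: true; startAfter: number }
  | { ok: false; code: "NOT_FOUND" };

// Decide where backfill starts when the store may have dropped old lines.
function resolveCursor(
  requested: number,
  oldestRetained: number,
  policy: CursorPolicy,
): CursorResolution {
  // The next line the client needs is requested + 1; it is still stored
  // as long as it is not older than the oldest retained line.
  if (requested + 1 >= oldestRetained) {
    return { ok: true, startAfter: requested };
  }
  if (policy === "error") {
    return { ok: false, code: "NOT_FOUND" };
  }
  // Fall back to the oldest line we still have; the client sees a gap.
  return { ok: true, startAfter: oldestRetained - 1 };
}
```

For example, with lines 1..99 truncated, resolveCursor(42, 100, "error") rejects the request, while the "oldest-available" policy resumes from line 100 and silently skips 43..99.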

Q: Why two event names instead of a single event: update? A: Status events carry state transitions drawn from a small finite set, while log events are unbounded text. Splitting them lets a client build a status-only view cheaply by listening for status alone and ignoring log.

Q: Does the SSE stream replace GET /api/jobs/{id} polling? A: No — GET /api/jobs/{id} remains a discrete snapshot. SSE is for live following.

Examples

The streaming log viewer in so1-rover opens the SSE stream on mount, auto-scrolls on each event: log, and on reconnect re-opens with the highest seen cursor, so a tab refresh during a 30-minute job loses no lines.
