SSE Job Event Stream & Cursor Backfill
so1 advanced 6 min read
ELI5
The server is a sportscaster commentating live. If you walk out for a snack, when you come back you say “I last heard line 42” and the caster fast-forwards through 43-50 before resuming live play-by-play.
Technical Deep Dive
Endpoint
GET /api/jobs/{jobId}/events opens a persistent SSE connection (HTTP/1.1, text/event-stream). Per ADR-003, SSE was chosen over WebSocket because the channel is server-to-client only and SSE clients such as the browser's EventSource already reconnect automatically over plain HTTP/1.1.
Event Shapes
event: status
data: {"state":"running"}

event: log
data: {"line":"Processing...","cursor":42}

event: status
data: {"state":"success","output":{...}}

Two event names: status (state transitions per so1-006) and log (append-only lines from Job.logs).
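To make the wire format above concrete, here is a minimal sketch of how a non-browser client might split the stream into events. The names `JobEvent` and `parse_sse` are hypothetical, and the parser handles only the `event` and `data` fields used on this page, not the full SSE grammar.

```python
import json
from typing import Iterator, NamedTuple


class JobEvent(NamedTuple):
    name: str   # "status" or "log" on this endpoint
    data: dict  # decoded JSON payload


def parse_sse(text: str) -> Iterator[JobEvent]:
    """Yield one JobEvent per blank-line-delimited SSE frame."""
    for frame in text.replace("\r\n", "\n").split("\n\n"):
        name = "message"  # SSE default when a frame has no "event:" field
        data_lines = []
        for line in frame.split("\n"):
            if line.startswith(":") or ":" not in line:
                continue  # simplification: skip comments and value-less fields
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                name = value
            elif field == "data":
                data_lines.append(value)
        if data_lines:  # frames without data are not dispatched
            yield JobEvent(name, json.loads("\n".join(data_lines)))


raw = (
    'event: status\ndata: {"state":"running"}\n\n'
    'event: log\ndata: {"line":"Processing...","cursor":42}\n\n'
)
events = list(parse_sse(raw))
```

In a browser the built-in EventSource does this parsing itself, so a sketch like this is only relevant for scripts or tests that read the raw stream.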
Reconnect with Cursor
sequenceDiagram
    participant C as Client
    participant B as BFF
    participant S as Log store
    C->>B: GET /api/jobs/{id}/events
    B-->>C: event:log line=1 cursor=1
    B-->>C: event:log line=42 cursor=42
    Note over C: network drop
    C->>B: GET /api/jobs/{id}/events?cursor=42
    B->>S: read lines 43..end
    S-->>B: lines 43..50
    B-->>C: backfill 43..50
    B-->>C: live stream resumes
    B-->>C: event:status state=success

The client tracks the last received cursor. On reconnect it appends ?cursor=N; the server backfills N+1..now from the persisted log store and then resumes the live tail.
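The backfill-then-live ordering can be sketched on the server side as a single generator. This is an illustration under assumptions, not the BFF's actual code: `stream_logs`, `persisted`, and `live` are hypothetical names, cursors are taken to be 1-based line numbers, and the sketch ignores the race between reading the store and subscribing to the live tail.

```python
from typing import Iterable, Iterator, Optional, Sequence


def stream_logs(
    persisted: Sequence[str],
    live: Iterable[str],
    cursor: Optional[int] = None,
) -> Iterator[dict]:
    """Yield log payloads: backfill lines cursor+1..end, then the live tail.

    Assumes cursor N identifies the Nth persisted line (1-based).
    """
    start = cursor or 0       # no cursor: replay from the first line
    next_cursor = start + 1
    for line in persisted[start:]:   # backfill phase
        yield {"line": line, "cursor": next_cursor}
        next_cursor += 1
    for line in live:                # live phase continues the same numbering
        yield {"line": line, "cursor": next_cursor}
        next_cursor += 1


store = [f"line {i}" for i in range(1, 51)]   # lines 1..50 already persisted
payloads = list(stream_logs(store, live=["line 51"], cursor=42))
```

With `cursor=42` the generator emits lines 43 to 50 from the store and then line 51 from the live source, which matches the sequence in the diagram.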
Termination
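After a terminal status event (success / failed / cancelled per so1-006), the server closes the stream. Clients that connect after the job has already finished receive a single status event carrying the final state, and the stream then closes. Browser clients should call EventSource.close() on a terminal status, because EventSource treats a server-side close as a dropped connection and reconnects automatically.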
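A client-side consumer of this contract might look like the sketch below. The function name `follow` and the `(name, data)` tuple shape are assumptions made for illustration; the terminal state names are the ones listed above.

```python
from typing import Iterable, Optional, Tuple

TERMINAL_STATES = {"success", "failed", "cancelled"}


def follow(events: Iterable[Tuple[str, dict]]) -> Tuple[Optional[str], Optional[int]]:
    """Consume (event name, payload) pairs until a terminal status arrives.

    Returns (final_state, last_cursor). final_state is None if the stream
    ended without a terminal status, in which case the caller should
    reconnect using last_cursor.
    """
    last_cursor = None
    for name, data in events:
        if name == "log":
            last_cursor = data["cursor"]
        elif name == "status" and data["state"] in TERMINAL_STATES:
            return data["state"], last_cursor  # stop reading, do not reconnect
    return None, last_cursor


# A late arrival sees only the final status event.
late = follow([("status", {"state": "success"})])

# A dropped connection ends without a terminal status.
dropped = follow([
    ("status", {"state": "running"}),
    ("log", {"line": "Processing...", "cursor": 42}),
])
```

Returning the last cursor alongside the state keeps the reconnect decision in one place: a `None` state means the stream was cut short rather than finished.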
Trade-offs (ADR-003)
- One persistent connection per browser tab — costs scale with viewers.
- Logs must be persisted (DB or S3) to enable cursor backfill; the store is deferred per ADR-003 Phase 2.
Key Terms
- SSE → Server-Sent Events: one-way text/event-stream over HTTP/1.1.
- Cursor → monotonically increasing line number or timestamp identifying a position in the log.
- Backfill → replaying missed entries after a reconnect before going live.
Q&A
Q: What happens if the client requests ?cursor=42 but the log store has been truncated?
A: ADR-003 lists log retention as deferred, so the behavior is implementation-defined: an out-of-range cursor should either fail with NOT_FOUND or backfill from the oldest available line.
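Both options can be expressed as one small policy function. This is a sketch only: `resolve_backfill_start`, `CursorNotFound`, and the `strict` flag are hypothetical, and it covers only the truncation case, not a cursor that is ahead of the store.

```python
class CursorNotFound(Exception):
    """Stands in for the NOT_FOUND error response (hypothetical name)."""


def resolve_backfill_start(cursor: int, oldest: int, strict: bool) -> int:
    """Return the first cursor to replay after a reconnect with ?cursor=N.

    oldest is the lowest cursor still held by the log store.
    """
    wanted = cursor + 1
    if wanted >= oldest:
        return wanted            # normal case: nothing needed was truncated
    if strict:
        raise CursorNotFound(f"lines {wanted}..{oldest - 1} are no longer retained")
    return oldest                # lenient case: resume from the oldest retained line


normal = resolve_backfill_start(42, oldest=1, strict=True)
lenient = resolve_backfill_start(42, oldest=100, strict=False)
```

The lenient policy silently skips lines, so a client using it would need some signal, such as a jump in cursor values, to show that a gap exists.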
Q: Why two event names instead of a single event: update?
A: Status events carry state transitions drawn from a small finite set that clients can handle exhaustively; log events are unbounded text. Splitting them lets a status-only view listen for status alone and ignore log payloads cheaply.
Q: Does the SSE stream replace GET /api/jobs/{id} polling?
A: No — GET /api/jobs/{id} remains a discrete snapshot. SSE is for live following.
Examples
The streaming log viewer in so1-rover opens the SSE connection on mount, auto-scrolls on each log event, and on reconnect re-opens the stream with the highest cursor it has seen, so a tab refresh during a 30-minute job loses no lines.
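The cursor bookkeeping behind that behavior is small. The sketch below is not the so1-rover code; `CursorTracker` and `events_url` are hypothetical names, and only the path and the `cursor` query parameter come from this page.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def events_url(job_id: str, last_cursor: Optional[int] = None) -> str:
    """Build the stream URL, adding ?cursor=N only when a cursor is known."""
    path = "/api/jobs/" + quote(job_id, safe="") + "/events"
    if last_cursor is None:
        return path
    query = urlencode({"cursor": last_cursor})
    return f"{path}?{query}"


class CursorTracker:
    """Remembers the highest cursor seen across log events."""

    def __init__(self) -> None:
        self.highest: Optional[int] = None

    def observe(self, cursor: int) -> None:
        if self.highest is None or cursor > self.highest:
            self.highest = cursor


tracker = CursorTracker()
first_url = events_url("job-1", tracker.highest)      # initial connect, no cursor
for seen in (40, 41, 42):
    tracker.observe(seen)
reconnect_url = events_url("job-1", tracker.highest)  # after a drop
```

Keeping the maximum rather than the latest value makes the tracker tolerant of a duplicate or out-of-order line arriving during backfill.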