CRUMB a card from devarno-cloud

Agent Transition Event Schema

kahn intermediate 7 min read

ELI5

Like the CI label scheme, but for thinking workers. Each parcel says which thinker sent it, what they were doing (thinking, calling a tool, reflecting), and at the end whether they reached a stable answer — not whether their shell returned zero.

Technical Deep Dive

contracts/schemas/agent_transitions.schema.json is the agent-fleet analogue of the CI schema. Same {ts, run_id, event} base, same wire shape — different per-event payloads.
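
The shared base can be illustrated with a minimal sketch, assuming events arrive as one JSON object per line (as in the Examples section). `parse_event` and `split_payload` are hypothetical helper names, not part of KAHN or the schema.

```python
import json

# The three base fields every event carries, per the schema description.
BASE_FIELDS = ("ts", "run_id", "event")


def parse_event(line: str) -> dict:
    """Decode one JSON line and check that the base fields are present strings."""
    obj = json.loads(line)
    if not isinstance(obj, dict):
        raise ValueError("event must be a JSON object")
    for field in BASE_FIELDS:
        if not isinstance(obj.get(field), str):
            raise ValueError(f"missing or non-string base field: {field}")
    return obj


def split_payload(obj: dict) -> tuple[dict, dict]:
    """Separate the shared base from the per-event payload."""
    base = {k: obj[k] for k in BASE_FIELDS}
    payload = {k: v for k, v in obj.items() if k not in BASE_FIELDS}
    return base, payload
```

Splitting the base from the payload keeps any code that only needs `ts`, `run_id`, and `event` independent of the per-event fields.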

Event Variants

classDiagram
    class AgentEvent {
        +string ts
        +string run_id
        +string event
    }
    class agent_run_start {
        +AgentId agent_id
        +string task
        +string? model
    }
    class agent_transition {
        +AgentId agent_id
        +int step
        +AgentStatus from
        +AgentStatus to
        +string? reason
    }
    class tool_invocation {
        +AgentId agent_id
        +int step
        +string tool_name
        +float duration_s
        +bool ok
        +string? input_summary
        +string? output_summary
        +string? error
    }
    class audit_checkpoint {
        +AgentId? agent_id
        +CheckpointId checkpoint_id
        +AuditResult result
        +float duration_s
        +object? evidence
    }
    class agent_run_end {
        +AgentId agent_id
        +AgentOutcome outcome
        +int total_steps
        +int total_tool_calls
        +int total_audit_checkpoints
        +int audits_passed
        +int audits_failed
        +float total_duration_s
        +float? convergence_score
    }
    AgentEvent <|-- agent_run_start
    AgentEvent <|-- agent_transition
    AgentEvent <|-- tool_invocation
    AgentEvent <|-- audit_checkpoint
    AgentEvent <|-- agent_run_end
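
The per-event payloads in the diagram can be turned into a rough required-field check. This is a sketch under two assumptions that the card does not confirm: fields marked `?` may be omitted entirely, and unknown event names should be rejected. `REQUIRED` and `missing_fields` are hypothetical names.

```python
# Required payload keys per event type, read off the class diagram.
# Assumption: fields marked "?" in the diagram are optional and may be absent.
REQUIRED = {
    "agent_run_start": {"agent_id", "task"},
    "agent_transition": {"agent_id", "step", "from", "to"},
    "tool_invocation": {"agent_id", "step", "tool_name", "duration_s", "ok"},
    "audit_checkpoint": {"checkpoint_id", "result", "duration_s"},
    "agent_run_end": {
        "agent_id", "outcome", "total_steps", "total_tool_calls",
        "total_audit_checkpoints", "audits_passed", "audits_failed",
        "total_duration_s",
    },
}


def missing_fields(event: dict) -> set:
    """Return the required payload keys that the event does not carry."""
    kind = event.get("event")
    if kind not in REQUIRED:
        raise ValueError(f"unknown event type: {kind!r}")
    return REQUIRED[kind] - set(event)
```

For real validation the JSON Schema file itself is the authority; a table like this is only useful for quick checks in a producer.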

Enums

  • agent_status: thinking | tool_call | tool_result | response | reflect | blocked-on-clarification | converged | failed
  • agent_outcome: converged | partial | escaped | aborted
  • audit_result: pass | fail | warn
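
A producer might mirror the three enums as Python `Enum` classes so that a misspelled value fails early. The class and member names below are illustrative; only the string values come from the schema.

```python
from enum import Enum


class AgentStatus(str, Enum):
    THINKING = "thinking"
    TOOL_CALL = "tool_call"
    TOOL_RESULT = "tool_result"
    RESPONSE = "response"
    REFLECT = "reflect"
    BLOCKED_ON_CLARIFICATION = "blocked-on-clarification"
    CONVERGED = "converged"
    FAILED = "failed"


class AgentOutcome(str, Enum):
    CONVERGED = "converged"
    PARTIAL = "partial"
    ESCAPED = "escaped"
    ABORTED = "aborted"


class AuditResult(str, Enum):
    PASS = "pass"
    FAIL = "fail"
    WARN = "warn"
```

Looking a value up by its wire string, for example `AgentStatus("reflect")`, raises `ValueError` for anything outside the enum.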

Status Lifecycle

stateDiagram-v2
    [*] --> thinking
    thinking --> tool_call
    tool_call --> tool_result
    tool_result --> response
    response --> reflect
    reflect --> thinking
    reflect --> converged
    thinking --> blocked_on_clarification
    blocked_on_clarification --> thinking: operator response
    thinking --> failed
    converged --> [*]
    failed --> [*]
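
The edges of the diagram can be written as a lookup table. The sketch below uses the enum spelling `blocked-on-clarification` rather than the diagram's underscore form, assuming the hyphenated string is the wire value. Whether KAHN rejects transitions outside this table is not stated here, so treat `is_allowed` (a hypothetical helper) as a producer-side lint.

```python
# Allowed next statuses for each status, copied from the lifecycle diagram.
ALLOWED = {
    "thinking": {"tool_call", "blocked-on-clarification", "failed"},
    "tool_call": {"tool_result"},
    "tool_result": {"response"},
    "response": {"reflect"},
    "reflect": {"thinking", "converged"},
    "blocked-on-clarification": {"thinking"},
    "converged": set(),  # terminal
    "failed": set(),     # terminal
}


def is_allowed(src: str, dst: str) -> bool:
    """True if the diagram has an edge from src to dst."""
    return dst in ALLOWED.get(src, set())
```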

ID Patterns

Field          Pattern                         Note
agent_id       ^[a-z0-9][a-z0-9:-]{0,63}$      Colons allowed for namespacing (claude-subagent:explore)
checkpoint_id  ^[a-z0-9][a-z0-9:.-]{0,127}$    Colons + dots (audit:test-coverage.unit)
step           int ≥ 0                         Monotonic within (run_id, agent_id)
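
The two ID patterns can be exercised directly with Python's `re` module. This is a small sketch; the function names are hypothetical, and `fullmatch` is used so that a trailing newline is not accepted.

```python
import re

# Patterns copied verbatim from the table above.
AGENT_ID_RE = re.compile(r"^[a-z0-9][a-z0-9:-]{0,63}$")
CHECKPOINT_ID_RE = re.compile(r"^[a-z0-9][a-z0-9:.-]{0,127}$")


def is_agent_id(value: str) -> bool:
    """1 to 64 chars: lowercase alphanumerics, colons, hyphens."""
    return AGENT_ID_RE.fullmatch(value) is not None


def is_checkpoint_id(value: str) -> bool:
    """1 to 128 chars: as agent_id, plus dots."""
    return CHECKPOINT_ID_RE.fullmatch(value) is not None
```

Note that a dotted ID such as `audit:test-coverage.unit` is a valid checkpoint_id but not a valid agent_id, because only the checkpoint pattern admits dots.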

Special Case: Run-Scoped Audit

audit_checkpoint.agent_id is the only field schema-wide that accepts null — used when a checkpoint asserts a global invariant rather than agent behaviour.
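
A consumer can tell the two kinds of checkpoint apart with a single null check. The sketch below uses a hypothetical helper `audit_scope` and an illustrative event (its timestamp, result, and duration are made up for the example). It also treats a missing `agent_id` key like null, which is an assumption rather than something the schema text confirms.

```python
import json


def audit_scope(event: dict) -> str:
    """Return "run" for a run-scoped checkpoint, "agent" otherwise."""
    if event.get("event") != "audit_checkpoint":
        raise ValueError("not an audit_checkpoint event")
    # Assumption: an absent agent_id is treated the same as an explicit null.
    return "run" if event.get("agent_id") is None else "agent"


# Illustrative run-scoped checkpoint; JSON null decodes to Python None.
RUN_SCOPED = json.loads(
    '{"ts":"2026-05-05T09:00:30.000Z","run_id":"agent-coder-1",'
    '"event":"audit_checkpoint","agent_id":null,'
    '"checkpoint_id":"audit:test-coverage.unit","result":"pass","duration_s":1.5}'
)
```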

convergence_score

Optional float in [0,1] on agent_run_end. 1.0 means fully converged; anything below 1.0 means partial. How the score is computed is up to the producer; KAHN only stores it.
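
Because the computation is left to the producer, a range check before emitting is a cheap safeguard. A minimal sketch, with `valid_convergence_score` as a hypothetical name:

```python
import math


def valid_convergence_score(score) -> bool:
    """True if score is absent (None) or a finite number in [0, 1]."""
    if score is None:
        return True  # the field is optional
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return False
    return math.isfinite(score) and 0.0 <= score <= 1.0
```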

Key Terms

  • converged → The product success signal. Distinct from a process exit code; an agent can exit 0 and still be partial.
  • blocked-on-clarification → The agent paused itself awaiting operator input — one of three fleet-specific powers KAHN admits.
  • step → Monotonic counter scoped to (run_id, agent_id); same agent across different runs does not share step numbers.

Q&A

Q: Why are step numbers scoped to (run_id, agent_id) and not just run_id?
A: Because a fleet run can host multiple agents in parallel; per-agent monotonic step counters preserve per-agent ordering without forcing a global sequence number.
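
Per-agent ordering can be checked by tracking the last step seen for each (run_id, agent_id) pair. The sketch assumes "monotonic" means non-decreasing, since the example stream has an agent_transition and a tool_invocation sharing step 0; `find_step_regressions` is a hypothetical helper.

```python
def find_step_regressions(events: list) -> list:
    """Return indexes of events whose step goes backwards for their (run_id, agent_id)."""
    last_step = {}
    regressions = []
    for index, event in enumerate(events):
        if "step" not in event:
            continue  # run start, run end, and audit events carry no step
        key = (event["run_id"], event["agent_id"])
        step = event["step"]
        # Assumption: equal steps are fine, only a decrease is a regression.
        if step < last_step.get(key, 0):
            regressions.append(index)
        else:
            last_step[key] = step
    return regressions
```

Two agents interleaving steps 0, 0, 1, 1 within one run produce no regressions, which is exactly what a single run-wide counter could not express.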

Q: What’s the difference between escaped and aborted outcomes?
A: escaped = the agent wandered out of scope but exited gracefully. aborted = external termination (timeout, kill, operator).

Q: Can tool_invocation carry the tool’s full output?
A: No. output_summary is capped at 2048 chars. Producers SHOULD redact secrets; KAHN does no scrubbing.
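
A producer can enforce the cap before emitting the event. A minimal sketch, assuming "chars" corresponds to Python string length; `cap_summary` and the `...` marker are illustrative choices, and the helper does no secret redaction.

```python
MAX_SUMMARY_CHARS = 2048  # cap stated for output_summary


def cap_summary(text: str, limit: int = MAX_SUMMARY_CHARS, marker: str = "...") -> str:
    """Truncate text to at most `limit` characters, ending with a marker if cut."""
    if len(text) <= limit:
        return text
    # Reserve room for the marker so the result never exceeds the limit.
    return text[: limit - len(marker)] + marker
```

Redaction should run before truncation so that a secret is never partially preserved at the cut point.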

Examples

A coder agent that converges:

{"ts":"2026-05-05T09:00:00.000Z","run_id":"agent-coder-1","event":"agent_run_start","agent_id":"coder","task":"Add /v2/health"}
{"ts":"2026-05-05T09:00:01.200Z","run_id":"agent-coder-1","event":"agent_transition","agent_id":"coder","step":0,"from":"thinking","to":"tool_call","reason":"tool_call:Read"}
{"ts":"2026-05-05T09:00:01.250Z","run_id":"agent-coder-1","event":"tool_invocation","agent_id":"coder","step":0,"tool_name":"Read","duration_s":0.04,"ok":true}
{"ts":"2026-05-05T09:01:05.000Z","run_id":"agent-coder-1","event":"agent_run_end","agent_id":"coder","outcome":"converged","total_steps":12,"total_tool_calls":7,"total_audit_checkpoints":2,"audits_passed":2,"audits_failed":0,"total_duration_s":65.0,"convergence_score":1.0}
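
The closing agent_run_end line can be sanity-checked on its own. The sketch below rests on an assumption the card does not state: checkpoints with result `warn` count toward total_audit_checkpoints but toward neither audits_passed nor audits_failed, so passed plus failed may be less than the total but never more. `run_end_problems` is a hypothetical helper.

```python
import json

# The agent_run_end line from the example above.
RUN_END_LINE = (
    '{"ts":"2026-05-05T09:01:05.000Z","run_id":"agent-coder-1",'
    '"event":"agent_run_end","agent_id":"coder","outcome":"converged",'
    '"total_steps":12,"total_tool_calls":7,"total_audit_checkpoints":2,'
    '"audits_passed":2,"audits_failed":0,"total_duration_s":65.0,'
    '"convergence_score":1.0}'
)

COUNTERS = ("total_steps", "total_tool_calls", "total_audit_checkpoints",
            "audits_passed", "audits_failed")


def run_end_problems(end: dict) -> list:
    """Return human-readable problems found in an agent_run_end event."""
    problems = [f"{name} is negative" for name in COUNTERS if end[name] < 0]
    # Assumption: "warn" results are counted in the total only.
    if end["audits_passed"] + end["audits_failed"] > end["total_audit_checkpoints"]:
        problems.append("audits_passed + audits_failed exceeds total_audit_checkpoints")
    return problems
```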
