Audit Checkpoint Fleet Partition
ELI5
A team of inspectors instead of one. The plumbing inspector signs the plumbing checkpoints, the electrician signs the wiring checkpoints, and a generalist signs anything that doesn’t fit a specialty. Each inspector’s signature is the agent_id stamped on the report.
Technical Deep Dive
audit/*.json files are declarative audit_checkpoint definitions. scripts/audit-runner.py executes them; each checkpoint is itself emitted as a first-class audit_checkpoint event under agent_transitions.schema.json.
Checkpoint Definition
A line from audit/contract-shape.json:
{"checkpoint_id":"audit:contract-shape","category":"contracts","description":"Verify contracts/kahn_emit.py and contracts/kahn_agent_emit.py expose the events declared in their respective JSON Schemas.","command":["python","scripts/audits/contract_shape.py"]}

Exit 0 → result: pass. Non-zero → result: fail. result: warn is the third permitted value (configurable per checkpoint).
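The exit-code rule above can be sketched in a few lines. This is an illustration only, not the contents of scripts/audit-runner.py: the names `result_from_exit_code` and `run_checkpoint` are hypothetical, the demo definition is a stand-in, and warn handling is left out because it is configured per checkpoint.

```python
import subprocess
import sys


def result_from_exit_code(returncode: int) -> str:
    """Map a process exit code to a checkpoint result (pass/fail only)."""
    return "pass" if returncode == 0 else "fail"


def run_checkpoint(definition: dict) -> dict:
    """Run the definition's command and return a partial result record."""
    completed = subprocess.run(
        definition["command"], capture_output=True, text=True
    )
    return {
        "checkpoint_id": definition["checkpoint_id"],
        "result": result_from_exit_code(completed.returncode),
    }


# Stand-in definition: the command is an inline one-liner that exits with 3,
# so the sketch runs without the real audit scripts being present.
demo = {
    "checkpoint_id": "audit:demo",
    "category": "contracts",
    "command": [sys.executable, "-c", "raise SystemExit(3)"],
}
outcome = run_checkpoint(demo)
```

Any non-zero exit code, not just 1, maps to fail in this sketch, matching the rule stated above.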
Fleet Partition (2026-04-27)
FLEET_AGENTS maps each checkpoint category 1:1 to a named agent_id:
| category | agent_id |
|---|---|
| rls | audit-rls |
| coverage | audit-coverage |
| contracts | audit-contracts |
| deps | audit-deps |
| citations | audit-citations |
| fallthrough | audit-runner |
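As a minimal sketch, the table can be expressed as a dictionary with a default. The entries come from the table above; the exact shape of FLEET_AGENTS in scripts/audit-runner.py is an assumption, and `FALLTHROUGH_AGENT` and `agent_for` are hypothetical names.

```python
# Category -> agent_id pairs, copied from the partition table.
FLEET_AGENTS = {
    "rls": "audit-rls",
    "coverage": "audit-coverage",
    "contracts": "audit-contracts",
    "deps": "audit-deps",
    "citations": "audit-citations",
}

# Assumed to be kept apart from the map so unknown categories never raise.
FALLTHROUGH_AGENT = "audit-runner"


def agent_for(category: str) -> str:
    """Return the agent_id for a category, falling through to the generalist."""
    return FLEET_AGENTS.get(category, FALLTHROUGH_AGENT)
```

With this shape, `agent_for("rls")` resolves to the specialist and an unlisted category such as `"perf"` resolves to audit-runner.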
flowchart LR
  A["audit/*.json (5 files)"] --> R["scripts/audit-runner.py"]
  R -->|category lookup| F[FLEET_AGENTS]
  F -->|rls| A1[audit-rls]
  F -->|coverage| A2[audit-coverage]
  F -->|contracts| A3[audit-contracts]
  F -->|deps| A4[audit-deps]
  F -->|citations| A5[audit-citations]
  F -->|other| A6[audit-runner]
  A1 --> E[transitions.jsonl: audit_checkpoint event]
  A2 --> E
  A3 --> E
  A4 --> E
  A5 --> E
  A6 --> E

Optional Evidence
If the checkpoint command emits JSON to stdout, the runner attaches it as the evidence object on the event. The evidence object is free-form: the schema preserves it verbatim, but payloads are kept under 1 KB.
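One way the stdout-to-evidence step might look is sketched below. The function name `evidence_from_stdout`, the 1024-byte limit, and the choice to drop oversized or non-object output are all assumptions made for illustration; the source only says that evidence is a JSON object kept small.

```python
import json
from typing import Optional

# Assumed reading of "sub-KB"; the real limit, if enforced at all, may differ.
MAX_EVIDENCE_BYTES = 1024


def evidence_from_stdout(stdout: str) -> Optional[dict]:
    """Return stdout parsed as a JSON object, or None if it is not usable."""
    text = stdout.strip()
    if not text:
        return None
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None  # plain log output is not evidence
    if not isinstance(parsed, dict):
        return None  # evidence is described as an object, so reject arrays/scalars
    if len(text.encode("utf-8")) > MAX_EVIDENCE_BYTES:
        return None  # assumption: oversized payloads are dropped, not truncated
    return parsed
```

The parsed object is returned unchanged, in keeping with the verbatim-preservation rule.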
Run-Scoped Checkpoints
audit_checkpoint.agent_id is the only field in the schema that accepts null. A null agent_id marks a run-scoped checkpoint, one that asserts a global invariant (e.g. an /audit/ runner asserting an I-1 file-write scope). Agent-scoped checkpoints carry a real agent_id from the partition above.
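The difference between the two scopes shows up only in the agent_id field. The sketch below builds both kinds of event; `checkpoint_event` and the checkpoint_id `audit:file-write-scope` are hypothetical names used for illustration, and only the fields shown elsewhere in this page are included.

```python
import json
from typing import Optional

ALLOWED_RESULTS = {"pass", "fail", "warn"}


def checkpoint_event(
    checkpoint_id: str, result: str, agent_id: Optional[str]
) -> dict:
    """Build an audit_checkpoint event; agent_id=None marks it run-scoped."""
    if result not in ALLOWED_RESULTS:
        raise ValueError(f"unknown result: {result!r}")
    return {
        "event": "audit_checkpoint",
        "agent_id": agent_id,
        "checkpoint_id": checkpoint_id,
        "result": result,
    }


# Run-scoped: no owning agent, so agent_id serializes as JSON null.
run_scoped = checkpoint_event("audit:file-write-scope", "pass", agent_id=None)
# Agent-scoped: agent_id comes from the fleet partition.
agent_scoped = checkpoint_event("audit:rls-enforcement", "fail", agent_id="audit-rls")

run_scoped_line = json.dumps(run_scoped)
```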
Why It Matters
The audit-runner is the dogfood producer for the agent path. The fleet-partition is what makes per-agent diagnostics aggregate cleanly: a flaky audit-rls shows up as a single diagnosis row, not buried under a generic audit-runner blob.
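To illustrate why the partition helps, the sketch below counts failed checkpoints per agent_id. It is not the diagnostics pipeline itself: `failures_by_agent` is a hypothetical helper and the sample events are made up for the demonstration.

```python
from collections import Counter


def failures_by_agent(events: list) -> Counter:
    """Count failed audit_checkpoint events, keyed by agent_id."""
    counts = Counter()
    for event in events:
        if event.get("event") == "audit_checkpoint" and event.get("result") == "fail":
            counts[event.get("agent_id")] += 1
    return counts


# Made-up sample: two rls failures, one deps failure, one passing check.
sample = [
    {"event": "audit_checkpoint", "agent_id": "audit-rls", "result": "fail"},
    {"event": "audit_checkpoint", "agent_id": "audit-rls", "result": "fail"},
    {"event": "audit_checkpoint", "agent_id": "audit-deps", "result": "fail"},
    {"event": "audit_checkpoint", "agent_id": "audit-coverage", "result": "pass"},
]
summary = failures_by_agent(sample)
```

If every checkpoint reported as audit-runner, the same sample would collapse into a single count of three with no indication of which category was flaky.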
Key Terms
- checkpoint_id → Hierarchical kebab-case identifier (audit:test-coverage.unit).
- FLEET_AGENTS → The category→agent_id map in the audit-runner; partitions diagnostics 1:1 with categories.
- evidence → Free-form JSON payload owned by the producing audit definition; KAHN preserves verbatim.
Q&A
Q: Why is audit-runner the fallthrough rather than rejecting unknown categories?
A: Forward compatibility — adding a new category file shouldn’t break the runner. Its checkpoints temporarily ride under audit-runner until FLEET_AGENTS is updated.
Q: How is result: warn triggered?
A: Per-checkpoint convention; the runner inspects exit-code/stdout per the checkpoint’s contract. The schema enum permits pass | fail | warn so warn is a first-class value, not a workaround.
Q: Where does the partition live in the source?
A: scripts/audit-runner.py defines FLEET_AGENTS; the rationale is in docs/findings/20260427-fleet-shadow-window-open.md.
Examples
audit:rls-enforcement runs against the cloud DSN, exits non-zero on a missing policy, and attaches {"missing":["public.alerts"]} as evidence. The runner then emits:

{"event":"audit_checkpoint","agent_id":"audit-rls","checkpoint_id":"audit:rls-enforcement","result":"fail","duration_s":0.42,"evidence":{"missing":["public.alerts"]}}
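A hedged sketch of the final write step: the event from the example is appended as one JSON line and read back. `append_event` is a hypothetical helper, and the temporary directory stands in for wherever transitions.jsonl actually lives.

```python
import json
import os
import tempfile


def append_event(path: str, event: dict) -> None:
    """Append one event as a single compact JSON line (JSONL)."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, separators=(",", ":")) + "\n")


event = {
    "event": "audit_checkpoint",
    "agent_id": "audit-rls",
    "checkpoint_id": "audit:rls-enforcement",
    "result": "fail",
    "duration_s": 0.42,
    "evidence": {"missing": ["public.alerts"]},
}

# Write to a throwaway file so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "transitions.jsonl")
    append_event(log_path, event)
    with open(log_path, encoding="utf-8") as handle:
        round_tripped = [json.loads(line) for line in handle]
```

Because each event occupies exactly one line, a reader can process the log line by line without parsing the whole file.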