CRUMB a card from devarno-cloud

Audit Checkpoint Fleet Partition

kahn intermediate 5 min read

ELI5

A team of inspectors instead of one. The plumbing inspector signs the plumbing checkpoints, the electrician signs the wiring checkpoints, and a generalist signs anything that doesn’t fit a specialty. Each inspector’s signature is the agent_id stamped on the report.

Technical Deep Dive

audit/*.json files are declarative audit_checkpoint definitions. scripts/audit-runner.py executes them; each checkpoint is itself emitted as a first-class audit_checkpoint event under agent_transitions.schema.json.

Checkpoint Definition

A line from audit/contract-shape.json:

{"checkpoint_id":"audit:contract-shape","category":"contracts",
"description":"Verify contracts/kahn_emit.py and contracts/kahn_agent_emit.py expose the events declared in their respective JSON Schemas.",
"command":["python","scripts/audits/contract_shape.py"]}

Exit 0 → result: pass. Non-zero → result: fail. result: warn is the third permitted value (configurable per checkpoint).
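
The exit-code rule above can be sketched as follows. This is a minimal illustration, not the code in scripts/audit-runner.py: the function name run_checkpoint is hypothetical, and warn is deliberately left out because its trigger is defined per checkpoint.

```python
import subprocess
import sys


def run_checkpoint(command: list[str]) -> str:
    """Run a checkpoint command and map its exit code to a result string.

    Hypothetical helper: the real runner may also inspect stdout to
    decide on "warn", which this sketch does not attempt.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    # Exit 0 means pass; any non-zero exit means fail.
    return "pass" if completed.returncode == 0 else "fail"


if __name__ == "__main__":
    # Stand-in commands so the sketch runs without the real audit scripts.
    print(run_checkpoint([sys.executable, "-c", "raise SystemExit(0)"]))
    print(run_checkpoint([sys.executable, "-c", "raise SystemExit(3)"]))
```

A command list such as the one in audit/contract-shape.json could be passed to run_checkpoint unchanged, since it is already in argv form.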

Fleet Partition (2026-04-27)

FLEET_AGENTS maps each checkpoint category 1:1 to a named agent_id:

| category    | agent_id        |
|-------------|-----------------|
| rls         | audit-rls       |
| coverage    | audit-coverage  |
| contracts   | audit-contracts |
| deps        | audit-deps      |
| citations   | audit-citations |
| fallthrough | audit-runner    |

flowchart LR
A["audit/*.json (5 files)"] --> R["scripts/audit-runner.py"]
R -->|category lookup| F[FLEET_AGENTS]
F -->|rls| A1[audit-rls]
F -->|coverage| A2[audit-coverage]
F -->|contracts| A3[audit-contracts]
F -->|deps| A4[audit-deps]
F -->|citations| A5[audit-citations]
F -->|other| A6[audit-runner]
A1 --> E[transitions.jsonl: audit_checkpoint event]
A2 --> E
A3 --> E
A4 --> E
A5 --> E
A6 --> E
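
The category lookup with its fallthrough could look roughly like this. The mapping values come from the table above, but the dict shape, the FALLTHROUGH_AGENT constant and the agent_for helper are assumptions made for illustration; the actual definition lives in scripts/audit-runner.py and may differ.

```python
# Category -> agent_id, as listed in the partition table.
# Assumption: the real FLEET_AGENTS is a plain dict like this one.
FLEET_AGENTS = {
    "rls": "audit-rls",
    "coverage": "audit-coverage",
    "contracts": "audit-contracts",
    "deps": "audit-deps",
    "citations": "audit-citations",
}

# Hypothetical name for the agent that receives unknown categories.
FALLTHROUGH_AGENT = "audit-runner"


def agent_for(category: str) -> str:
    """Return the agent_id for a category, falling through when unknown."""
    return FLEET_AGENTS.get(category, FALLTHROUGH_AGENT)


print(agent_for("contracts"))  # known category
print(agent_for("licenses"))   # unknown category, rides under the fallthrough
```

Using dict.get with a default keeps unknown categories from raising, which matches the forward-compatibility goal: a new category file is attributed to audit-runner until the map is updated.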

Optional Evidence

If the checkpoint command emits JSON to stdout, the runner attaches it as the evidence object on the event. evidence is free-form: the schema preserves it verbatim, but payloads are kept under 1 KB.
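
One way to picture the evidence step is a small parser over the command's stdout. This is a sketch under assumptions: parse_evidence, evidence_size and MAX_EVIDENCE_BYTES are hypothetical names, and the source does not say how the real runner treats non-JSON output or oversized payloads.

```python
import json

# Assumption: "sub-KB" is read here as fewer than 1024 bytes when serialized.
MAX_EVIDENCE_BYTES = 1024


def parse_evidence(stdout: str):
    """Return stdout as a JSON object, or None if it is not one."""
    text = stdout.strip()
    if not text:
        return None
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    # Evidence is described as an object, so other JSON types are ignored.
    return parsed if isinstance(parsed, dict) else None


def evidence_size(evidence: dict) -> int:
    """Size in bytes of the compact UTF-8 serialization of the evidence."""
    return len(json.dumps(evidence, separators=(",", ":")).encode("utf-8"))


evidence = parse_evidence('{"missing":["public.alerts"]}')
print(evidence, evidence_size(evidence) < MAX_EVIDENCE_BYTES)
```

The parsed object is returned untouched, which is the sketch's stand-in for "preserved verbatim"; the size helper only measures and leaves any enforcement policy to the caller.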

Run-Scoped Checkpoints

audit_checkpoint.agent_id is the only field in the schema that accepts null. A null agent_id marks a run-scoped checkpoint, one that asserts a global invariant (e.g. an /audit/ runner asserting an I-1 file-write scope). Agent-scoped checkpoints carry a real agent_id from the partition above.
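
The difference between the two scopes shows up directly in the serialized event. In this sketch, build_event is a hypothetical helper and audit:file-write-scope is an invented checkpoint_id used only for illustration; the field names follow the event shown in the Examples section.

```python
import json


def build_event(checkpoint_id: str, result: str, agent_id=None) -> dict:
    """Build an audit_checkpoint event; agent_id=None means run-scoped."""
    return {
        "event": "audit_checkpoint",
        "agent_id": agent_id,
        "checkpoint_id": checkpoint_id,
        "result": result,
    }


# Run-scoped: no agent owns the invariant, so agent_id serializes as null.
run_scoped = build_event("audit:file-write-scope", "pass")  # hypothetical id

# Agent-scoped: agent_id comes from the fleet partition.
agent_scoped = build_event("audit:rls-enforcement", "fail", agent_id="audit-rls")

print(json.dumps(run_scoped))
print(json.dumps(agent_scoped))
```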

Why It Matters

The audit-runner is the dogfood producer for the agent path. The fleet-partition is what makes per-agent diagnostics aggregate cleanly: a flaky audit-rls shows up as a single diagnosis row, not buried under a generic audit-runner blob.
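
To make the aggregation point concrete, here is a sketch that counts failed checkpoints per agent_id from JSONL lines. The function failures_by_agent and the sample lines are illustrative assumptions; the real diagnostics pipeline is not described in this card.

```python
import json
from collections import Counter


def failures_by_agent(jsonl_lines) -> Counter:
    """Count failed audit_checkpoint events per agent_id."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        if event.get("event") == "audit_checkpoint" and event.get("result") == "fail":
            counts[event.get("agent_id")] += 1
    return counts


# Illustrative events, not real log output.
sample = [
    '{"event":"audit_checkpoint","agent_id":"audit-rls","checkpoint_id":"audit:rls-enforcement","result":"fail"}',
    '{"event":"audit_checkpoint","agent_id":"audit-rls","checkpoint_id":"audit:rls-enforcement","result":"fail"}',
    '{"event":"audit_checkpoint","agent_id":"audit-contracts","checkpoint_id":"audit:contract-shape","result":"pass"}',
]

print(failures_by_agent(sample))
```

With a single shared agent_id, both failures and the unrelated pass would sit under one key; with the partition, the flaky agent stands out on its own row.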

Key Terms

  • checkpoint_id → Hierarchical kebab-case identifier (audit:test-coverage.unit).
  • FLEET_AGENTS → The category→agent_id map in the audit-runner; partitions diagnostics 1:1 with categories.
  • evidence → Free-form JSON payload owned by the producing audit definition; KAHN preserves verbatim.

Q&A

Q: Why is audit-runner the fallthrough rather than rejecting unknown categories? A: Forward compatibility — adding a new category file shouldn’t break the runner. Its checkpoints temporarily ride under audit-runner until FLEET_AGENTS is updated.

Q: How is result: warn triggered? A: Per-checkpoint convention; the runner inspects exit-code/stdout per the checkpoint’s contract. The schema enum permits pass | fail | warn so warn is a first-class value, not a workaround.

Q: Where does the partition live in the source? A: scripts/audit-runner.py defines FLEET_AGENTS; the rationale is in docs/findings/20260427-fleet-shadow-window-open.md.

Examples

audit:rls-enforcement runs against the cloud DSN, exits non-zero on a missing policy, and attaches {"missing":["public.alerts"]} as evidence. The runner then emits:

{"event":"audit_checkpoint","agent_id":"audit-rls","checkpoint_id":"audit:rls-enforcement",
"result":"fail","duration_s":0.42,"evidence":{"missing":["public.alerts"]}}
