Fleet State Registry — state/ + registry.yaml
petrova intermediate 7 min read
ELI5
Two files describe every governed repo. registry.yaml says who the repo is and what it’s supposed to be doing. state/<slug>.yaml says what the probes actually saw last time they checked. They are allowed to disagree — the dashboard surfaces the gap so you can act on it.
Technical Deep Dive
The two-space split (MR-13)
    classDiagram
      class RegistryEntry {
        slug: kebab-case
        url: https URL
        default_branch: string
        role: control-plane | production | experimental | scaffold
        profile: strict | standard | permissive
        fleets_allowed: string[]
        added: ISO date
        contract_sha: 12-hex
        integrations_applicability: required | not_applicable | optional
        contract_committers: ["humans"] | [...]
        notes: paragraph
      }
      class StateFile {
        slug: kebab-case
        contract_sha: 12-hex (matches registry)
        last_full_sweep: ISO datetime
        last_verified_at: per-integration ISO datetime
        probe_history: ProbeOutcome[80] (ring buffer)
        current_status: per-integration enum
      }
      class ProbeOutcome {
        integration: ares|traceo|crumb|rocky|eva|lore_cairnet
        timestamp: ISO datetime
        outcome: ok|degraded|failing|unreachable|not_applicable
        detail: 200-char string?
        evidence_sha: 12-hex?
      }
      class ConsumerContract {
        .petrova/contract.yaml
        integrations.<id>.status: pending|wired|not_applicable
        evidence: per-integration shape
      }
      RegistryEntry "1" --> "1" StateFile : slug match
      StateFile "1" --> "20" ProbeOutcome : ring buffer
      ConsumerContract "1" --> "1" StateFile : observation reflects this intent

registry.yaml (one file in petrova-hq, with a repos: list) is the directory: it declares which repos petrova governs and how strictly.
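state/<slug>.yaml (one file per repo) is the probe ledger: bot-written observation, kept as a ring buffer of recent probe results. The design keeps the last 20 results per integration, and the schema caps probe_history at 80 entries in total (maxItems: 80). With five integrations plus the legacy lore_cairnet, 20 each would exceed that cap, so the 80-entry total is the limit that binds.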
.petrova/contract.yaml (in the consumer repo, not in petrova-hq) is the declared intent — what integrations the repo claims to wire and the evidence backing each claim.
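To make the split concrete, here is a minimal Python sketch of how a dashboard might compute the gap between declared intent and observed state. The function name `intent_observation_gaps` and the rule that a gap exists only when the contract says `wired` but the observation is not `ok` are assumptions for illustration; the source says only that the dashboard surfaces the disagreement.

```python
from typing import Dict, List, Optional, Tuple


def intent_observation_gaps(
    contract_status: Dict[str, str],
    current_status: Dict[str, str],
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (integration, declared, observed) for each disagreement.

    Assumption: a gap exists when the contract declares `wired` but the
    observed status is anything other than `ok`. Neither input is modified,
    mirroring the rule that intent and observation never overwrite each other.
    """
    gaps = []
    for integration, declared in sorted(contract_status.items()):
        observed = current_status.get(integration)  # None if never probed
        if declared == "wired" and observed != "ok":
            gaps.append((integration, declared, observed))
    return gaps


# Intent, as declared in the consumer repo's .petrova/contract.yaml
contract = {"ares": "wired", "traceo": "wired", "crumb": "not_applicable"}
# Observation, as recorded in petrova-hq's state/<slug>.yaml
state = {"ares": "degraded", "traceo": "ok", "crumb": "not_applicable"}

print(intent_observation_gaps(contract, state))
```

The function only reads both inputs and reports differences, which is the point of keeping the two spaces separate.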
state/ schema highlights (contracts/state.schema.json)
    {
      "required": ["slug", "contract_sha", "last_full_sweep", "last_verified_at", "probe_history", "current_status"],
      "additionalProperties": false,
      "properties": {
        "slug": { "pattern": "^[a-z0-9][a-z0-9-]*[a-z0-9]$" },
        "contract_sha": { "pattern": "^[a-f0-9]{12}$" },
        "probe_history": { "maxItems": 80, ... },
        "current_status": {
          "properties": {
            "ares": { "enum": ["ok","degraded","failing","stale","not_applicable","pending"] },
            ...
          }
        }
      }
    }

maxItems: 80 is the retention budget: older history rolls off. If you need lineage beyond 80 entries, write a finding or decision doc; the state file is recent context, not an archive.
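The highlighted constraints can be checked with the standard library alone. The sketch below is not the project's validator: `check_state` and `append_probe` are hypothetical helpers, and only the rules visible in the excerpt (required keys, the two patterns, and maxItems) are covered. It does not enforce additionalProperties or the status enums.

```python
import re
from typing import Any, Dict, List

# Patterns and limits copied from the schema excerpt above.
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]*[a-z0-9]$")
SHA_RE = re.compile(r"^[a-f0-9]{12}$")
REQUIRED = ["slug", "contract_sha", "last_full_sweep",
            "last_verified_at", "probe_history", "current_status"]
MAX_HISTORY = 80


def check_state(state: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means these checks pass."""
    problems = [f"missing required key: {k}" for k in REQUIRED if k not in state]
    if "slug" in state and not SLUG_RE.fullmatch(str(state["slug"])):
        problems.append("slug does not match the kebab-case pattern")
    if "contract_sha" in state and not SHA_RE.fullmatch(str(state["contract_sha"])):
        problems.append("contract_sha is not 12 lowercase hex characters")
    if len(state.get("probe_history", [])) > MAX_HISTORY:
        problems.append("probe_history exceeds maxItems")
    return problems


def append_probe(history: List[dict], outcome: dict) -> List[dict]:
    """Append one outcome and keep only the newest MAX_HISTORY entries."""
    return (history + [outcome])[-MAX_HISTORY:]


sample = {
    "slug": "petrova-hq",
    "contract_sha": "13efa309b813",
    "last_full_sweep": "2026-05-06T16:42:46.271Z",
    "last_verified_at": {},  # placeholder: per-integration shape not shown in the excerpt
    "probe_history": [],
    "current_status": {"ares": "not_applicable"},
}
print(check_state(sample))

# Roll-off: after 85 appends only the newest 80 remain.
history: List[dict] = []
for i in range(85):
    history = append_probe(history, {"n": i})
print(len(history), history[0]["n"])
```

Slicing from the end is the simplest way to express "older history rolls off" for a list that is serialized back to YAML on every sweep.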
registry.yaml fields that gate behaviour
    - slug: kahn-hq
      role: production
      profile: standard
      fleets_allowed: [kahn-implementer, kahn-diagnostics, kahn-reviewer, kahn-planner]
      integrations_applicability:
        ares: required
        traceo: required
        crumb: not_applicable
        rocky: required
        eva: not_applicable
      contract_committers: ["humans"]

| Field | What it gates |
|---|---|
| role | classification only; informs human decisions, not verb dispatch |
| profile | gates request_merge_when_green: only permissive (or standard with explicit fleets_allowed) allows auto-merge labels |
| fleets_allowed | which agent fleet IDs may invoke write verbs against this repo; empty array → humans only |
| integrations_applicability | declares, per integration, whether the probe short-circuits to not_applicable or actively checks |
| contract_committers | who can land contract edits (typically humans only) |
| contract_sha | 12-hex digest of the consumer’s .petrova/contract.yaml; a mismatch with the state file’s contract_sha signals contract drift |
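The two gates in the table can be sketched as a pair of predicates. This is an illustrative reading, not petrova's implementation: the function names are hypothetical, and treating permissive as allowing auto-merge labels without consulting fleets_allowed is an assumption, since the source describes that case only loosely.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RegistryEntry:
    slug: str
    profile: str  # strict | standard | permissive
    fleets_allowed: List[str] = field(default_factory=list)


def may_invoke_write_verb(entry: RegistryEntry, fleet_id: str) -> bool:
    """An empty fleets_allowed means humans only, so no fleet passes."""
    return fleet_id in entry.fleets_allowed


def may_request_merge_when_green(entry: RegistryEntry, fleet_id: str) -> bool:
    """Profile gate for auto-merge labels."""
    if entry.profile == "strict":
        return False  # refuses auto-merge even with a non-empty fleets_allowed
    if entry.profile == "standard":
        return fleet_id in entry.fleets_allowed  # fleet must be explicitly listed
    if entry.profile == "permissive":
        return True  # assumption: fleets_allowed is not consulted here
    raise ValueError(f"unknown profile: {entry.profile}")


kahn = RegistryEntry(
    slug="kahn-hq",
    profile="standard",
    fleets_allowed=["kahn-implementer", "kahn-diagnostics",
                    "kahn-reviewer", "kahn-planner"],
)
# Hypothetical strict entry, used only to show that strict wins over the list.
strict_repo = RegistryEntry("example-strict-repo", "strict", ["kahn-reviewer"])

print(may_request_merge_when_green(kahn, "kahn-reviewer"))
print(may_request_merge_when_green(kahn, "some-other-fleet"))
print(may_request_merge_when_green(strict_repo, "kahn-reviewer"))
```

Keeping the two predicates separate mirrors the table: fleets_allowed answers "may this fleet write at all", while profile answers "may anything auto-merge here".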
MR-14 — auto-demote only
petrova doctor may transition current_status[<integration>]:
- ok → degraded (probe surfaced an issue with hysteresis)
- degraded → failing (sustained issue)
- failing → unreachable (repo or endpoint gone)
It MUST NOT promote pending → wired or degraded → ok as a contract change. Promoting a consumer’s contract requires a human PR to that consumer repo’s .petrova/contract.yaml that includes the supporting evidence. The next probe round then checks the new claim and may demote current_status again if the evidence doesn’t hold.
This matters: a probe that auto-promoted intent would let the dashboard quietly re-write what the consumer claimed it was doing — exactly the failure mode MR-13 forbids.
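One way to picture the auto-demote rule is as a small allow-list of transitions. The sketch below takes a strict reading of MR-14, in which only the three listed demotions are accepted and everything else is refused; the `demote` helper and its error type are hypothetical, and hysteresis is not modelled.

```python
# The only transitions petrova doctor may apply, per the list above.
ALLOWED_DEMOTIONS = {
    ("ok", "degraded"),
    ("degraded", "failing"),
    ("failing", "unreachable"),
}


def demote(current: str, proposed: str) -> str:
    """Return the new status, refusing anything that is not a listed demotion."""
    if proposed == current:
        return current  # nothing to change
    if (current, proposed) in ALLOWED_DEMOTIONS:
        return proposed
    raise ValueError(f"transition {current} -> {proposed} is not an allowed demotion")


print(demote("ok", "degraded"))

try:
    demote("degraded", "ok")  # a promotion: refused
except ValueError as err:
    print(err)
```

Encoding the rule as data makes the asymmetry visible: a new transition has to be added to the set on purpose.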
Real example — petrova-hq’s own state file
    slug: petrova-hq
    contract_sha: 13efa309b813
    last_full_sweep: '2026-05-06T16:42:46.271Z'
    probe_history:
      - integration: rocky
        timestamp: '2026-05-06T16:42:46.234Z'
        outcome: not_applicable
        detail: control plane self-entry — agentic content emitted here is meta...
        evidence_sha: null
      ...
    current_status:
      ares: not_applicable
      traceo: not_applicable
      crumb: not_applicable
      rocky: not_applicable
      eva: not_applicable

petrova-hq is not_applicable across the board: it is the playbook, not a consumer of itself.
Why slug + contract_sha both appear in both files
slug is the join key: each registry entry pairs with exactly one state file by matching slug. Both files also carry contract_sha so that a state read can detect when the registry’s view of the consumer’s contract has moved (the consumer landed a PR, so the sha recorded in the state file is now stale). The petrova doctor sweep refreshes both atomically.
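Drift detection then reduces to comparing two short digests. The sketch below assumes the 12-hex value is a truncated SHA-256 of the raw file bytes; the source states only that it is a 12-hex digest of the contract content, so the hash choice and both helper names are illustrative.

```python
import hashlib


def contract_sha(contract_bytes: bytes) -> str:
    """12-hex digest of contract content (assumed: SHA-256, truncated)."""
    return hashlib.sha256(contract_bytes).hexdigest()[:12]


def has_contract_drift(registry_sha: str, state_sha: str) -> bool:
    """True when the state file recorded a different contract than the registry."""
    return registry_sha != state_sha


# Hypothetical contract contents before and after a consumer PR.
before = b"integrations:\n  ares:\n    status: pending\n"
after = b"integrations:\n  ares:\n    status: wired\n"

state_sha = contract_sha(before)    # what the last sweep recorded
registry_sha = contract_sha(after)  # what the registry sees now

print(has_contract_drift(registry_sha, state_sha))
```

Once a sweep rewrites both values from the same contract bytes, the comparison returns False again.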
Key Terms
- Probe ring buffer: the probe_history array, capped at 80 entries; oldest entries roll off.
- Contract sha: 12-hex digest of .petrova/contract.yaml content; cross-checked between registry and state.
- fleets_allowed: registry field that whitelists which agent fleet IDs may invoke write verbs against the repo.
Q&A
Q: Which file holds intent and which holds observation?
A: .petrova/contract.yaml (in the consumer repo) holds intent; state/<slug>.yaml (in petrova-hq) holds observation. They live in different repos by design (MR-13). The dashboard surfaces the disagreement; neither overwrites the other.
Q: What’s the maximum length of probe_history, and what does that imply about retention?
A: maxItems: 80 in total. The design target is twenty entries per integration, but with five integrations (and legacy lore_cairnet still allowed) the 80-entry cap is the limit that binds. Older entries roll off. Long-form lineage doesn’t live here; it lives in decision docs or findings.
Q: What does fleets_allowed gate, and what does profile gate?
A: fleets_allowed whitelists which agent fleet IDs may invoke write verbs against the repo (empty array = humans only). profile gates request_merge_when_green — permissive allows auto-merge labels broadly; standard requires the fleet to be explicitly listed; strict (e.g. petrova-hq itself) forbids auto-merge. They compose: a strict profile with a non-empty fleets_allowed still refuses auto-merge.
Examples
petrova doctor runs a daily sweep. For kahn-hq it probes ARES, TRACEO, and ROCKY (per integrations_applicability). The ARES probe returns degraded (one webhook delivery missed). The sweep appends a new probe_history entry to the state file and demotes current_status.ares from ok to degraded (an auto-demotion allowed by MR-14). In kahn-hq’s .petrova/contract.yaml, integrations.ares.status stays wired: intent is unchanged and only observation moved. The next dashboard render shows kahn-hq’s ARES square in amber.