CRUMB a card from devarno-cloud

Fleet State Registry — state/ + registry.yaml

petrova intermediate 7 min read

ELI5

Two files describe every governed repo. registry.yaml says who the repo is and what it’s supposed to be doing. state/<slug>.yaml says what the probes actually saw last time they checked. They are allowed to disagree — the dashboard surfaces the gap so you can act on it.

Technical Deep Dive

The two-space split (MR-13)

classDiagram
class RegistryEntry {
slug: kebab-case
url: https URL
default_branch: string
role: control-plane | production | experimental | scaffold
profile: strict | standard | permissive
fleets_allowed: string[]
added: ISO date
contract_sha: 12-hex
integrations_applicability: required | not_applicable | optional
contract_committers: ["humans"] | [...]
notes: paragraph
}
class StateFile {
slug: kebab-case
contract_sha: 12-hex (matches registry)
last_full_sweep: ISO datetime
last_verified_at: per-integration ISO datetime
probe_history: ProbeOutcome[80] (ring buffer)
current_status: per-integration enum
}
class ProbeOutcome {
integration: ares|traceo|crumb|rocky|eva|lore_cairnet
timestamp: ISO datetime
outcome: ok|degraded|failing|unreachable|not_applicable
detail: 200-char string?
evidence_sha: 12-hex?
}
class ConsumerContract {
.petrova/contract.yaml
integrations.<id>.status: pending|wired|not_applicable
evidence: per-integration shape
}
RegistryEntry "1" --> "1" StateFile : slug match
StateFile "1" --> "0..80" ProbeOutcome : ring buffer
ConsumerContract "1" --> "1" StateFile : observation reflects this intent

registry.yaml (one file in petrova-hq, repos: list) is the directory — it declares which repos petrova governs and how strictly.

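state/<slug>.yaml (one file per repo) is the probe ledger: a bot-written record of observations, kept as a ring buffer of recent probe results. The retention target is the last 20 results per integration (the five current integrations plus the legacy lore_cairnet), and the schema caps the whole array at 80 entries, so the cap, not the per-integration figure, is the hard limit.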

.petrova/contract.yaml (in the consumer repo, not in petrova-hq) is the declared intent — what integrations the repo claims to wire and the evidence backing each claim.

state/ schema highlights (contracts/state.schema.json)

{
  "required": ["slug", "contract_sha", "last_full_sweep", "last_verified_at",
               "probe_history", "current_status"],
  "additionalProperties": false,
  "properties": {
    "slug": { "pattern": "^[a-z0-9][a-z0-9-]*[a-z0-9]$" },
    "contract_sha": { "pattern": "^[a-f0-9]{12}$" },
    "probe_history": { "maxItems": 80, ... },
    "current_status": {
      "properties": {
        "ares": { "enum": ["ok","degraded","failing","stale","not_applicable","pending"] },
        ...
      }
    }
  }
}

maxItems: 80 is the retention budget — older history rolls off. If you need lineage beyond 80 entries, write a finding or decision doc; the state file is recent context, not archive.
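
The roll-off behaviour can be sketched in a few lines of Python. This is a minimal illustration, not petrova's implementation: the helper name append_probe, the seq field and the dict shape are assumptions made for the example; only the 80-entry cap comes from the schema.

```python
from collections import deque

MAX_ITEMS = 80  # mirrors "maxItems": 80 in contracts/state.schema.json


def append_probe(history, outcome):
    """Return a new probe_history list with `outcome` appended,
    keeping only the newest MAX_ITEMS entries."""
    # deque(maxlen=N) drops entries from the left (the oldest) once it is full.
    buf = deque(history, maxlen=MAX_ITEMS)
    buf.append(outcome)
    return list(buf)


# Hypothetical data: 85 probe results for one integration.
history = []
for i in range(85):
    history = append_probe(history, {"integration": "ares", "seq": i, "outcome": "ok"})

print(len(history))       # 80
print(history[0]["seq"])  # 5 (the five oldest entries rolled off)
```

Anything that has rolled off is gone from the state file, which is why longer lineage has to be written down elsewhere.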

registry.yaml fields that gate behaviour

- slug: kahn-hq
  role: production
  profile: standard
  fleets_allowed: [kahn-implementer, kahn-diagnostics, kahn-reviewer, kahn-planner]
  integrations_applicability:
    ares: required
    traceo: required
    crumb: not_applicable
    rocky: required
    eva: not_applicable
  contract_committers: ["humans"]

| Field | What it gates |
| --- | --- |
| role | classification only; informs human decisions, not verb dispatch |
| profile | gates request_merge_when_green; only permissive (or standard with explicit fleets_allowed) allows auto-merge labels |
| fleets_allowed | which agent fleet IDs may invoke write verbs against this repo; empty array → humans only |
| integrations_applicability | declares whether each probe should short-circuit as not_applicable or actively check |
| contract_committers | who can land contract edits (typically humans only) |
| contract_sha | 12-hex digest of the consumer’s .petrova/contract.yaml; a mismatch with the state file’s contract_sha signals contract drift |
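
How profile and fleets_allowed compose can be sketched as two small predicates. This is a hedged reading of the rules described on this card, not petrova's actual dispatch code: both function names are hypothetical, and treating permissive as "no extra fleet check" is an assumption about what allowing auto-merge labels broadly means.

```python
def may_invoke_write_verb(fleet_id, fleets_allowed):
    """A fleet may invoke write verbs only if it is listed.
    An empty fleets_allowed list therefore means humans only."""
    return fleet_id in fleets_allowed


def may_auto_merge(profile, fleet_id, fleets_allowed):
    """Hypothetical reading of the profile gate on request_merge_when_green."""
    if profile == "strict":
        return False  # strict refuses auto-merge whatever fleets_allowed says
    if profile == "standard":
        return fleet_id in fleets_allowed  # standard needs the fleet listed explicitly
    if profile == "permissive":
        return True  # assumption: "broadly" means no additional fleet check
    raise ValueError(f"unknown profile: {profile!r}")


kahn_fleets = ["kahn-implementer", "kahn-diagnostics", "kahn-reviewer", "kahn-planner"]
print(may_auto_merge("standard", "kahn-reviewer", kahn_fleets))  # True
print(may_auto_merge("strict", "kahn-reviewer", kahn_fleets))    # False
print(may_invoke_write_verb("kahn-reviewer", []))                # False
```

Keeping the two checks separate mirrors the split in the registry: fleets_allowed answers "who may write at all", while profile answers "may a merge happen without a human".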

MR-14 — auto-demote only

petrova doctor may transition current_status[<integration>]:

  • ok → degraded (probe surfaced an issue with hysteresis)
  • degraded → failing (sustained issue)
  • failing → unreachable (repo or endpoint gone)

It MUST NOT promote pending → wired or degraded → ok as a contract change. Promoting a consumer’s contract requires a human PR against that consumer repo’s .petrova/contract.yaml, with the supporting evidence pasted in. The next probe round then confirms the new state, and may demote current_status again if the evidence doesn’t hold.

This matters: a probe that auto-promoted intent would let the dashboard quietly re-write what the consumer claimed it was doing — exactly the failure mode MR-13 forbids.
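
The demote-only rule amounts to a small allow-list of transitions. The sketch below is an illustration under assumptions, not the petrova doctor source: the names ALLOWED_DEMOTIONS and is_allowed_demotion are hypothetical, and it takes the strict reading that only the three listed single-step demotions are permitted.

```python
# The three transitions petrova doctor may apply to current_status[<integration>].
ALLOWED_DEMOTIONS = {
    ("ok", "degraded"),
    ("degraded", "failing"),
    ("failing", "unreachable"),
}


def is_allowed_demotion(current, proposed):
    """True only for the listed one-step demotions; promotions and any
    transition not in the list are rejected."""
    return (current, proposed) in ALLOWED_DEMOTIONS


print(is_allowed_demotion("ok", "degraded"))  # True
print(is_allowed_demotion("degraded", "ok"))  # False (a promotion)
print(is_allowed_demotion("ok", "failing"))   # False (not in the listed set)
```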

Real example — petrova-hq’s own state file

state/petrova-hq.yaml
slug: petrova-hq
contract_sha: 13efa309b813
last_full_sweep: '2026-05-06T16:42:46.271Z'
probe_history:
  - integration: rocky
    timestamp: '2026-05-06T16:42:46.234Z'
    outcome: not_applicable
    detail: control plane self-entry — agentic content emitted here is meta...
    evidence_sha: null
  ...
current_status:
  ares: not_applicable
  traceo: not_applicable
  crumb: not_applicable
  rocky: not_applicable
  eva: not_applicable

petrova-hq is not_applicable across the board — it is the playbook, not a consumer of itself.

Why slug + contract_sha both appear in both files

slug is the join key: each registry entry maps to exactly one state file with the same slug. Both files also carry contract_sha so a state read can detect when the registry’s view of the consumer’s contract has moved (the consumer landed a PR, so the sha recorded in the state file is now stale). The petrova doctor sweep refreshes both atomically.
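
A drift check along these lines can be sketched as follows. It is a minimal illustration, assuming both files have already been parsed into dicts; the function name contract_drifted is hypothetical and the digests in the usage lines are placeholders, not real values. The two patterns are copied from the state schema excerpt.

```python
import re

# Patterns taken from contracts/state.schema.json (anchors implied by fullmatch).
SLUG_RE = re.compile(r"[a-z0-9][a-z0-9-]*[a-z0-9]")
SHA_RE = re.compile(r"[a-f0-9]{12}")


def contract_drifted(registry_entry, state_file):
    """Return True when the two contract_sha values differ for the same slug."""
    for label, doc in (("registry", registry_entry), ("state", state_file)):
        if not SLUG_RE.fullmatch(doc["slug"]):
            raise ValueError(f"{label}: malformed slug {doc['slug']!r}")
        if not SHA_RE.fullmatch(doc["contract_sha"]):
            raise ValueError(f"{label}: malformed contract_sha {doc['contract_sha']!r}")
    if registry_entry["slug"] != state_file["slug"]:
        raise ValueError("registry entry and state file describe different repos")
    return registry_entry["contract_sha"] != state_file["contract_sha"]


registry = {"slug": "kahn-hq", "contract_sha": "aaaaaaaaaaaa"}  # placeholder digest
state = {"slug": "kahn-hq", "contract_sha": "bbbbbbbbbbbb"}     # placeholder digest
print(contract_drifted(registry, state))     # True
print(contract_drifted(registry, registry))  # False
```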

Key Terms

  • Probe ring buffer — the probe_history array capped at 80 entries; oldest entries roll off.
  • Contract sha — 12-hex digest of .petrova/contract.yaml content; cross-checked between registry and state.
  • fleets_allowed — registry field that whitelists which agent fleet IDs may invoke write verbs against the repo.

Q&A

Q: Which file holds intent and which holds observation? A: .petrova/contract.yaml (in the consumer repo) holds intent; state/<slug>.yaml (in petrova-hq) holds observation. They live in different repos by design (MR-13). The dashboard surfaces the disagreement; neither overwrites the other.

Q: What’s the maximum length of probe_history, and what does that imply about retention? A: maxItems: 80 for the whole array, shared by the five integrations and the still-allowed legacy lore_cairnet, with a nominal twenty entries per integration. Older entries roll off. Long-form lineage doesn’t live here; it lives in decision docs or findings.

Q: What does fleets_allowed gate, and what does profile gate? A: fleets_allowed whitelists which agent fleet IDs may invoke write verbs against the repo (empty array = humans only). profile gates request_merge_when_green: permissive allows auto-merge labels broadly; standard requires the fleet to be explicitly listed; strict (e.g. petrova-hq itself) forbids auto-merge. They compose: a strict profile with a non-empty fleets_allowed still refuses auto-merge.

Examples

petrova doctor runs a daily sweep. For kahn-hq it probes ARES, TRACEO and ROCKY (the integrations marked required in integrations_applicability). The ARES probe returns degraded (one webhook delivery missed). The state file gains a new probe_history entry and current_status.ares is demoted ok → degraded (an auto-demotion allowed by MR-14). In kahn-hq’s .petrova/contract.yaml, integrations.ares.status stays wired: intent is unchanged; only observation moved. The next dashboard render shows kahn-hq’s ARES square amber.
