Verification Rounds at Phase Close

petrova intermediate 6 min read

ELI5

A verification round is the post-it-walk after the work is built. You don’t fix anything during it — you just walk the deliverables one more time with fresh eyes and write down what you notice. The friction budget then absorbs each note as either trivial (close it), acceptable carry (keep it open), or next-phase work (give it a milestone ID).

Technical Deep Dive

The round, in one picture

flowchart TD
  start["02-phase-close invokes 03-verification-round (MR-10)"]
  start --> probes{"choose probes by phase shape"}
  probes -->|web phase| A["A — Visitor probe (routes, a11y, responsive, perf)"]
  probes -->|infra phase| B["B — Operator probe (deploy/migrate/observe/recover)"]
  probes -->|all phases| C["C — Spec-vs-build (impl + test + traceability)"]
  probes -->|all phases| D["D — Invariant probe (per I-N: preserved? guarded?)"]
  probes -->|all phases| E["E — Drift probe (per drift_watch in north-star)"]
  probes -->|all phases| F["F — Decision-doc audit (MR-7, MR-4)"]

  A --> findings
  B --> findings
  C --> findings
  D --> findings
  E --> findings
  F --> findings

  findings["findings list — F1, F2, ..."]
  findings --> sev{severity}
  sev -->|invariant-violation| block["BLOCK — top of output, route to implementer"]
  sev -->|trivial / minor / substantive| cat["category"]

  cat -->|closed| closed["closed within phase (trivial only)"]
  cat -->|in-budget| inb["in-budget — one-sentence justification"]
  cat -->|deferred| def["deferred → M{N+1}.x.y assigned in same commit"]

  block --> stop["round short-circuits"]
  closed --> reviewer
  inb --> reviewer
  def --> reviewer
  reviewer["reviewer classifies + writes phase-close decision doc"]

Probe taxonomy (`prompts/03-verification-round.md`)

Probe	Lens	Typical findings
A — Visitor	”I am a paying user opening this for the first time”	broken renders, 500/404, layout breaks, A11y gaps, responsive gaps, perf smell
B — Operator	”I am inheriting this system at the phase boundary”	missing runbooks, undocumented failure modes, no rollback path
C — Spec-vs-build	each spec requirement covered by phase	missing impl/test/traceability triad
D — Invariant	per `I-N` in CLAUDE.md	violations (blocking), missing fail-closed guards
E — Drift	per `drift_watches:` in north-star	observed pulls toward an anti-shape, missing counter-anchor
F — Decision-doc audit	walk `docs/decisions/` for this phase	MR-7 unmarked supersession, MR-4 ISO date violations, lingering open questions

Finding shape

Each finding (F<N>) carries six fields:

- Probe: A | B | C | D | E | F
- Surface: <file path / route / module / doc>
- Observation: <one paragraph>
- Severity: trivial | minor | substantive | invariant-violation
- Reproduction: <command or click path>
- Suggested category: closed | in-budget | deferred
- If deferred, suggested milestone parent: M{N+1}.{x}

The two short-circuits

Severity = invariant-violation → blocking. Goes to the top of the output; the reviewer routes to the implementer instead of classifying. These bypass the friction budget by design — they are not friction the phase chose to leave; they are work the phase failed to do.
Substantive + closed simultaneously → forbidden by Step 3 sanity check. “Substantive things must defer; closing them silently is the failure mode MR-2 prevents.” If you’re tempted to close-fix a substantive item during the round, that is the discipline failure.

Ownership split

flowchart LR
  auditor["auditor — owns the round, produces faithful findings list"]
  reviewer["reviewer — reads list, classifies, writes phase-close doc"]
  auditor -->|findings list| reviewer

The split is intentional: the auditor’s job is to probe truthfully without classifying, the reviewer’s job is to decide what each finding means for the budget. Same person doing both is how findings get hidden when they look inconvenient.

Diagnostic counts

02-phase-close.md Step 6 surfaces a leading indicator:

deferred > 3 → reviewer warning: “Phase N deferred a lot. Consider re-reading the phase scope before opening N+1.”
Two phases in a row with deferred > 3 → phase decomposition is wrong; consider re-running bootstrap-style calibration on upcoming phases.
Zero findings → the round didn’t probe deeply enough; push back. (KAHN’s M6.5 round is the canonical reference — surfaced live-dot redesign + cross-view consistency as carry-overs that made phase 7’s plan work.)

Key Terms

Friction budget — the ## Friction budget section of the phase-close decision doc; closed / in-budget / deferred buckets.
Invariant-violation finding — severity that bypasses the budget; routes back to implementer instead of classifying.
Counter-anchor — the code or process anchor that catches a named drift watch; if a drift pull happens and no counter-anchor exists, the round records drift-confirmed.

Q&A

Q: What’s the difference between an invariant-violation finding and a deferred finding? A: Invariant violations short-circuit the round — the implementer must fix them before the phase closes (they are work the phase failed to do, not work the phase chose to defer). Deferred findings get an M{N+1}.x.y milestone in the next phase and the close commits unblocked.

Q: Why is a round with zero findings a smell? A: Because the round’s purpose is to probe what the acceptance gates didn’t catch. Zero findings means either the phase was trivial (rare) or the probes were superficial (likely). The prompt explicitly says: “A round with zero findings is not a sign of perfection; it’s a sign you didn’t probe deeply enough. Push back.”

Q: Which subagent owns the round, and which one classifies the output? A: The auditor owns the round and produces the faithful findings list. The reviewer reads it and classifies into closed / in-budget / deferred, then writes the phase-close decision doc. The split prevents the same person from both finding and explaining-away inconvenient items.

Examples

KAHN’s Phase 6 close runs probes A (visitor) + C (spec-vs-build) + D (invariants) + E (drift) + F (decision audit). Auditor produces 7 findings. Reviewer classifies: 1 closed (typo in runbooks/deploy.md — trivial), 2 in-budget (telemetry seam half-built; live-dot animation jank), 4 deferred (live-dot redesign → M7.2.1; cross-view consistency → M7.2.2; spec §4.3 missing trace → M7.3.1; drift toward “CI-shaped” via test-naming → M7.3.2). Phase-close decision doc cites every finding and milestone.

neighbors on the map

guard.sh & verify.sh Hook Contract writing a pre-send validator that aborts bad inputs
eva doctor Validation Rules diagnosing a doctor FAIL line
Verifier & Verification Protocol diagnosing why verify_zk_proof returns Ok(false) for a proof that appears structurally valid