Promotion Lifecycle Gates
eva intermediate 5 min read
ELI5
A prompt grows up through three school grades. To leave kindergarten (draft) you need three signed homework slips. To leave middle school (tested) you need ten signed slips and a passing report card (eva eval green). The principal (--force) can sign anyone through, but logs a yellow warning.
Technical Deep Dive
State Machine
stateDiagram-v2
    [*] --> draft
    draft --> tested : promote (≥ min_uses successful runs)
    tested --> ready : promote (≥ min_uses verified runs + green eval)
    ready --> [*]
    draft --> tested : --force (warning)
    tested --> ready : --force (warning)

Gate Definitions (bin/eva:831-883)

| Transition | Default min_uses | Counted as | Eval required? |
|---|---|---|---|
| draft → tested | 3 | rows in .usage.jsonl where sent=true and verified ≠ false (i.e. true OR null) | no |
| tested → ready | 10 | rows where sent=true and verified == true (strict) | yes — eval.yml present and last .eval.jsonl row all_passed=true |
Per-prompt overrides live in meta.promotion.{min_uses, require_eval}. --force prints each unmet gate to stderr as a yellow warning then proceeds.
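The two counting rules in the table can be sketched as follows. This is a minimal illustration, not the code in bin/eva: the function name `count_runs` and the sample rows are hypothetical, only the `sent` and `verified` fields come from the table above, and treating a missing `verified` key the same as null is an assumption.

```python
import json


def count_runs(usage_lines):
    """Return (successful, verified) counts from .usage.jsonl-style lines.

    successful: sent is true and verified is not false (true or null)
    verified:   sent is true and verified is strictly true
    """
    successful = 0
    verified = 0
    for line in usage_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        row = json.loads(line)
        if row.get("sent") is not True:
            continue  # unsent runs count toward neither gate
        # Assumption: a missing "verified" key behaves like null.
        if row.get("verified") is not False:
            successful += 1
        if row.get("verified") is True:
            verified += 1
    return successful, verified


# Hypothetical rows; a real .usage.jsonl row likely carries more fields.
sample = [
    '{"sent": true, "verified": true}',
    '{"sent": true, "verified": null}',
    '{"sent": true, "verified": false}',
    '{"sent": false, "verified": true}',
]
print(count_runs(sample))  # (2, 1)
```

The unverified (null) row counts toward draft → tested but not toward tested → ready, which is why the second gate is stricter than its larger min_uses alone suggests.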
Decision Flow
flowchart TD
    start["eva promote <id>"] --> cur{current status}
    cur -- ready --> halt["die: cannot promote (already at top)"]
    cur -- draft --> g1["successful = sent AND verified ≠ false"]
    cur -- tested --> g2["verified = sent AND verified == true"]
    g1 --> ok1{count ≥ min_uses?}
    g2 --> ok2{count ≥ min_uses AND eval green?}
    ok1 -- no --> fail["red: list failed gates"]
    ok2 -- no --> fail
    ok1 -- yes --> bump["set status=tested; updated=today"]
    ok2 -- yes --> bump2["set status=ready; updated=today"]
    fail --> force{--force?}
    force -- yes --> bump
    force -- no --> exit1["return 1"]

Versioning Convention
Per README: patch = wording, minor = section/constraint change, major = structural rewrite or first promotion to ready. Breaking changes to a ready prompt are forked: eva new <id>-v2 --from <id> restarts at draft.
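The patch/minor/major rule can be illustrated with a small helper. This is a sketch under assumptions: it presumes versions are plain MAJOR.MINOR.PATCH strings and that lower components reset to zero on a bump (the usual semantic-versioning habit, which the README excerpt does not confirm). The function name `bump_version` and the `CHANGE_LEVEL` keys are hypothetical labels for the change kinds listed above.

```python
# Hypothetical mapping from the change kinds named in the convention
# to the version component they bump.
CHANGE_LEVEL = {
    "wording": "patch",
    "section": "minor",
    "constraint": "minor",
    "structural_rewrite": "major",
    "first_promotion_to_ready": "major",
}


def bump_version(version, change):
    """Return the next version string for a given kind of change."""
    level = CHANGE_LEVEL[change]  # KeyError on an unknown change kind
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(bump_version("0.3.2", "wording"))                   # 0.3.3
print(bump_version("0.3.2", "constraint"))                # 0.4.0
print(bump_version("0.3.2", "first_promotion_to_ready"))  # 1.0.0
```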
Key Terms
- successful run: a sent: true row whose verified field is true OR null; counts toward the tested gate.
- verified run: a sent: true row whose verified field is strictly true; counts toward the ready gate.
- promotion gates: the meta.promotion mapping lets a prompt override min_uses and require_eval.
Q&A
Q: How many successful sent runs are needed for draft → tested by default?
A: Three. min_uses defaults to 3 for the tested target (bin/eva:848).
Q: What additional artefact is required for tested → ready?
A: An eval.yml whose most recent eva eval recorded all_passed: true in .eval.jsonl (bin/eva:854-863). Bypassable with meta.promotion.require_eval: false.
Q: How does --force differ from a clean promotion?
A: Same status bump and updated stamp, but unmet gates are written to stderr in yellow rather than blocking the bump (bin/eva:868-872).
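The answers above can be tied together in a rough sketch of the promotion decision. It is illustrative only and does not reproduce bin/eva: `unmet_gates`, `promote`, and their parameters are hypothetical names, the counts and the eval result are passed in rather than read from .usage.jsonl and .eval.jsonl, and applying require_eval only to the tested → ready transition is an assumption.

```python
import sys

# Defaults from the gate table, keyed by the current status.
DEFAULT_MIN_USES = {"draft": 3, "tested": 10}
NEXT_STATUS = {"draft": "tested", "tested": "ready"}


def unmet_gates(status, successful, verified, eval_green, promotion=None):
    """List human-readable descriptions of gates that are not yet met."""
    promotion = promotion or {}  # stands in for meta.promotion overrides
    min_uses = promotion.get("min_uses", DEFAULT_MIN_USES[status])
    unmet = []
    if status == "draft":
        if successful < min_uses:
            unmet.append(f"successful runs {successful} < {min_uses}")
    else:
        if verified < min_uses:
            unmet.append(f"verified runs {verified} < {min_uses}")
        # Assumption: require_eval matters only for tested -> ready.
        if promotion.get("require_eval", True) and not eval_green:
            unmet.append("eval not green")
    return unmet


def promote(status, successful, verified, eval_green,
            promotion=None, force=False):
    """Return (new_status, unmet); status is unchanged when blocked."""
    if status == "ready":
        raise ValueError("cannot promote (already at top)")
    unmet = unmet_gates(status, successful, verified, eval_green, promotion)
    if unmet and not force:
        return status, unmet  # blocked: caller reports the failed gates
    for gate in unmet:
        # With force, unmet gates become warnings instead of blockers.
        print(f"warning: {gate}", file=sys.stderr)
    return NEXT_STATUS[status], unmet


print(promote("tested", 12, 9, True))              # ('tested', ['verified runs 9 < 10'])
print(promote("tested", 12, 9, True, force=True))  # ('ready', ['verified runs 9 < 10'])
```

Returning the unmet list in both cases keeps the forced and clean paths identical apart from whether the bump happens, which mirrors the description of --force in the answer above.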
Examples
Block of meta.yml lowering the gate for an experimental prompt:
promotion:
  min_uses: 1
  require_eval: false

Neighbors on the map
- .usage.jsonl Append Format: computing prompt usage stats outside of eva show
- eva eval Cases, Assertions & Judges: adding a new case to eval.yml
- Site Hosting Modes & Lifecycle Stages: adding a new fork branching on platform vs user_git
- CAIRNET+LORE Graduation Pipeline: implementing the graduation flow