CRUMB a card from devarno-cloud

Execution Trace Format

grace intermediate 6 min read

ELI5

Every chain execution leaves a flight recorder. Top of the box has the chain’s name, version, fingerprint, and total duration. Inside are step-by-step entries — which agent ran, how long it took, hashes of what went in and out, and the gate verdicts. The recorder ships to disk; it does not ship to git.

Technical Deep Dive

Source: automation/trace-protocol.md, @stratt/cli’s lib/trace.ts, lib/run-log-reader.ts. Mirrors IRProgram from @stratt/ir/src/types.ts.

Storage

  • Path: automation/traces/YYYY-MM-DD-{chain-slug}.yaml
  • Gitignored — traces contain prompt + response content.
  • Retention: 90 days local; future-target Cloudflare R2 archive.
  • Sampling: 100% by default; at 100+ runs/day, sample at a rate of 0.2, but always trace failures and gate resolutions.
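
The path convention and sampling rule above can be sketched roughly as follows. This is an illustration only, not the code in lib/trace.ts: the names tracePath, shouldTrace, and RunInfo are hypothetical, the date is assumed to be UTC, and "100+ runs/day" is read here as the number of runs already seen today.

```typescript
// Hypothetical helpers; names and shapes are illustrative, not from @stratt/cli.

interface RunInfo {
  failed: boolean;            // the chain ended in a failure state
  hasGateResolution: boolean; // at least one gate was resolved during the run
}

const HIGH_VOLUME_THRESHOLD = 100; // runs/day at which sampling starts
const HIGH_VOLUME_RATE = 0.2;      // fraction of ordinary runs kept above the threshold

// Build automation/traces/YYYY-MM-DD-{chain-slug}.yaml (UTC date assumed).
function tracePath(startedAt: Date, chainSlug: string): string {
  const day = startedAt.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `automation/traces/${day}-${chainSlug}.yaml`;
}

// Decide whether to write a trace. `random` is injectable so the rule can be tested.
function shouldTrace(
  run: RunInfo,
  runsToday: number,
  random: () => number = Math.random,
): boolean {
  if (run.failed || run.hasGateResolution) return true; // always traced
  if (runsToday < HIGH_VOLUME_THRESHOLD) return true;   // 100% by default
  return random() < HIGH_VOLUME_RATE;                   // sampled at 0.2
}
```

Passing the random source as a parameter keeps the sampling decision deterministic under test while defaulting to Math.random in normal use.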

Trace Structure

classDiagram
class ChainTrace {
+string trace_id
+URI chain_uri
+semver version
+Fingerprint fingerprint
+string session_id
+string council
+datetime started_at
+datetime completed_at
+int duration_ms
+ChainStatus status
+float quality_score
+TokenCounts token_counts
+StepTrace[] steps
}
class StepTrace {
+string step_id
+URI unit_uri
+Fingerprint unit_fingerprint
+Designation agent
+bool gate
+datetime started_at
+datetime completed_at
+int duration_ms
+StepStatus status
+Fingerprint input_hash
+Fingerprint output_hash
+TokenCounts token_counts
+QualityIndicators quality_indicators
+GateResolution? gate_resolution
}
class GateResolution {
+GateState state
+Designation resolved_by
+datetime resolved_at
+int wait_duration_ms
+string reason
+Fingerprint packet_hash
}
class QualityIndicators {
+bool contract_conformance
+float completeness
+float token_efficiency
}
ChainTrace --> StepTrace : steps
StepTrace --> GateResolution : on gate=true
StepTrace --> QualityIndicators : per step
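
For readers who prefer types to diagrams, a trimmed TypeScript view of StepTrace and GateResolution might look like the sketch below. Only the field names come from the diagram. The string aliases, the ISO 8601 datetime encoding, and the checkStep helper are assumptions, as is the idea that duration_ms equals completed_at minus started_at.

```typescript
// Trimmed, assumed TypeScript view of the diagram; not the types in @stratt/ir.
type Fingerprint = string;
type GateState = string; // e.g. "APPROVED"; the full set of states is not listed in this card

interface GateResolution {
  state: GateState;
  resolved_by: string;
  resolved_at: string; // ISO 8601 datetime (assumed encoding)
  wait_duration_ms: number;
  reason: string;
  packet_hash: Fingerprint;
}

interface StepTrace {
  step_id: string;
  gate: boolean;
  started_at: string;
  completed_at: string;
  duration_ms: number;
  input_hash: Fingerprint;
  output_hash: Fingerprint;
  gate_resolution?: GateResolution; // only expected when gate is true
}

// Hypothetical consistency check: returns human-readable problems, empty if none.
function checkStep(step: StepTrace): string[] {
  const problems: string[] = [];
  if (!step.gate && step.gate_resolution !== undefined) {
    problems.push(`${step.step_id}: gate_resolution present on a non-gate step`);
  }
  // Assumption: duration_ms is the wall-clock gap between the two timestamps.
  const elapsed = Date.parse(step.completed_at) - Date.parse(step.started_at);
  if (elapsed !== step.duration_ms) {
    problems.push(`${step.step_id}: duration_ms ${step.duration_ms} != elapsed ${elapsed}`);
  }
  return problems;
}
```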

IRProgram Compatibility

IRProgram Field  | Trace Equivalent
chainUri         | chain_uri
version          | version
meta.council     | council
steps[].id       | steps[].step_id
steps[].unitUri  | steps[].unit_uri
steps[].agent    | steps[].agent
steps[].gate     | steps[].gate
steps[].inputs   | steps[].input_hash (hashed, not verbatim)
steps[].outputs  | steps[].output_hash (hashed)
edges            | implicit from step ordering + gate dependencies
failureModes     | captured in step status transitions
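
The field mapping above can be expressed as a small projection function. The sketch assumes an IRProgram shape containing only the fields named in the mapping (the real type in @stratt/ir/src/types.ts is likely richer), and toTraceSkeleton and PayloadHasher are hypothetical names. The hash function is passed in so that no particular Blake3 binding is implied, and JSON.stringify stands in for whatever canonical serialization the real code uses.

```typescript
// Assumed, trimmed IRProgram shape: only the fields named in the mapping.
interface IRStep {
  id: string;
  unitUri: string;
  agent: string;
  gate: boolean;
  inputs: unknown;
  outputs: unknown;
}

interface IRProgram {
  chainUri: string;
  version: string;
  meta: { council: string };
  steps: IRStep[];
}

// Stand-in for a Blake3 binding; any string -> digest function fits.
type PayloadHasher = (payload: string) => string;

// Project an IRProgram onto the trace fields that have a direct equivalent.
function toTraceSkeleton(program: IRProgram, hash: PayloadHasher) {
  return {
    chain_uri: program.chainUri,
    version: program.version,
    council: program.meta.council,
    steps: program.steps.map((s) => ({
      step_id: s.id,
      unit_uri: s.unitUri,
      agent: s.agent,
      gate: s.gate,
      // Payloads are hashed, never copied verbatim into the trace.
      input_hash: hash(JSON.stringify(s.inputs)),
      output_hash: hash(JSON.stringify(s.outputs)),
    })),
  };
}
```

edges and failureModes are deliberately absent from the output: per the mapping, they are implied by step ordering and status transitions rather than stored as fields.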

Why Hash Inputs/Outputs

Verbatim prompt/response content is sensitive (PII, secrets) and large. Storing only the Blake3 hash keeps a run verifiable (anyone holding the original payload can confirm it matches what ran) without keeping the payload in the trace. The full payload lives in run logs, which run-log-reader.ts merges with the trace by timestamp.
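
One way to picture the merge is a nearest-timestamp lookup followed by a hash check. The sketch below is a guess at the idea, not the logic in run-log-reader.ts: the RunLogEntry and StepRef shapes, the findPayload name, and the 5-second skew window are all assumptions, and the hash function is injected in place of a real Blake3 binding.

```typescript
// Hypothetical shapes; the real run-log format is not shown in this card.
interface RunLogEntry {
  timestamp: string; // ISO 8601
  payload: string;   // verbatim prompt or response
}

interface StepRef {
  step_id: string;
  started_at: string; // ISO 8601
  input_hash: string;
}

type HashFn = (payload: string) => string; // stand-in for Blake3

// Find the log entry closest in time to the step, then accept it only if
// its hash matches the one recorded in the trace.
function findPayload(
  step: StepRef,
  log: RunLogEntry[],
  hash: HashFn,
  maxSkewMs = 5000, // assumed tolerance, not a documented value
): string | undefined {
  const target = Date.parse(step.started_at);
  let best: RunLogEntry | undefined;
  let bestGap = Infinity;
  for (const entry of log) {
    const gap = Math.abs(Date.parse(entry.timestamp) - target);
    if (gap <= maxSkewMs && gap < bestGap) {
      best = entry;
      bestGap = gap;
    }
  }
  if (best === undefined) return undefined;
  return hash(best.payload) === step.input_hash ? best.payload : undefined;
}
```

The hash check is what makes the timestamp join safe: a nearby but unrelated log entry is rejected rather than silently attached to the step.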

DSPy Export

commands/export-dspy.ts flattens traces to JSONL with six filters: --min-score, --date-range, --version-range, --chain, --exclude-gates, plus an implicit unit filter. The export format matches DSPy MIPROv2’s expected training shape.
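
A rough sketch of filtering and flattening to JSONL follows. It covers only three of the filters (--min-score, --chain, --exclude-gates), and their semantics here are assumed from the flag names. TraceSummary, ExportFilters, exportJsonl, and the emitted record fields are placeholders; they are not claimed to match what export-dspy.ts writes or what MIPROv2 expects.

```typescript
// Hypothetical, trimmed trace shape for export purposes.
interface TraceSummary {
  chain_uri: string;
  version: string;
  quality_score: number;
  steps: {
    step_id: string;
    unit_uri: string;
    gate: boolean;
    input_hash: string;
    output_hash: string;
  }[];
}

// Assumed option names mirroring --min-score, --chain, --exclude-gates.
interface ExportFilters {
  minScore?: number;
  chain?: string;
  excludeGates?: boolean;
}

// Flatten traces to JSONL: one JSON object per surviving step, one per line.
function exportJsonl(traces: TraceSummary[], filters: ExportFilters): string {
  const lines: string[] = [];
  for (const trace of traces) {
    if (filters.minScore !== undefined && trace.quality_score < filters.minScore) continue;
    if (filters.chain !== undefined && trace.chain_uri !== filters.chain) continue;
    for (const step of trace.steps) {
      if (filters.excludeGates === true && step.gate) continue;
      lines.push(
        JSON.stringify({
          chain_uri: trace.chain_uri,
          version: trace.version,
          step_id: step.step_id,
          unit_uri: step.unit_uri,
          input_hash: step.input_hash,
          output_hash: step.output_hash,
          quality_score: trace.quality_score,
        }),
      );
    }
  }
  return lines.join("\n");
}
```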

Key Terms

  • trace_id → tr-{UUIDv7}. UUIDv7 was chosen for time-sortability without an extra timestamp index.
  • input_hash / output_hash → Blake3 fingerprint of the payload, stored instead of the payload itself.
  • quality_indicators → Per-step inputs to the chain-level quality_score heuristic (see grace-014).
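
The time-sortability of trace_id follows from the UUIDv7 layout, whose first 48 bits hold a Unix timestamp in milliseconds. As a small illustration, the function below recovers that timestamp from a tr-{UUIDv7} string. The name traceIdTimestampMs is hypothetical, and the example ID is made up for demonstration.

```typescript
// Parse the millisecond Unix timestamp embedded in a `tr-{UUIDv7}` identifier.
// The first 12 hex digits of a UUIDv7 (8 + 4 across the first hyphen) are the
// 48-bit big-endian timestamp, which fits safely in a JavaScript number.
function traceIdTimestampMs(traceId: string): number {
  const pattern =
    /^tr-([0-9a-f]{8})-([0-9a-f]{4})-7[0-9a-f]{3}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
  const match = pattern.exec(traceId);
  if (match === null) {
    throw new Error(`not a tr-{UUIDv7} identifier: ${traceId}`);
  }
  const [, high = "", low = ""] = match;
  return parseInt(high + low, 16);
}

// Made-up ID whose timestamp bits encode 1700000000000 ms (2023-11-14T22:13:20Z).
const exampleId = "tr-018bcfe5-6800-7000-8000-000000000000";
const exampleStartedAt = new Date(traceIdTimestampMs(exampleId));
```

Because the timestamp occupies the leading digits, sorting IDs as plain strings (in a consistent letter case) also orders them by creation time.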

Q&A

Q: Why are traces gitignored? A: They contain captured prompts and responses, which can hold PII or secrets. The trace format is committed; the trace records are not.

Q: What identifier scheme is trace_id? A: UUIDv7 prefixed with tr-. Time-ordered so traces sort chronologically without needing a separate timestamp index.

Q: Can I reconstruct a chain run from a trace alone? A: You can reconstruct the structure (which units ran, in what order, with what gate verdicts) but not the exact prompts/responses — those live in run logs and are merged by timestamp proximity in run-log-reader.ts.

Examples

A 3-step dev/chain/sol-1-boot@0.1.0 execution with one gate produces a trace where:

  • chain_uri/version/fingerprint pin the exact composition that ran.
  • steps[0].input_hash lets a future verifier confirm the same intake-parse input was supplied.
  • steps[2].gate_resolution (state = APPROVED, resolved_by = LEWIS-06) records the Pathfinder gate decision.
  • quality_score = 0.82 flags the run as Good (≥ 0.80, no action needed).
