CRUMB a card from devarno-cloud

5 Locked Architectural Decisions (DEC-001 to DEC-005)

iris intermediate 6 min read

ELI5

IRIS has five “laws” that were decided early on and are very hard to change. They’re like the constitution of a country — they determine how everything else works. They cover: how sprites get their own endpoints, why work happens step-by-step (not all at once), how rules are written, why each piece lives in its own repo, and how identity is proven with special math.

Technical Deep Dive

Decision Register

All decisions are stored in iris-specs/.metadata/decision-register.yaml with locked status. They require extraordinary justification and maintainer consensus to modify.

timeline
title IRIS Architectural Decision Timeline
2026 Q1 : DEC-001 : Separate API Endpoints Per Sprite
: DEC-004 : Microservice Repo Architecture
2026 Q2 : DEC-002 : Synchronous Chain Execution
: DEC-003 : Declarative YAML + Code Hooks
: DEC-005 : Blake3 Canonical Fingerprinting

DEC-001: Separate API Endpoints Per Sprite

Decision: Each sprite gets its own endpoint. No monolithic dispatcher.

Rationale: Decouples versioning, scaling, and debugging.

| Aspect | Benefit |
| --- | --- |
| Versioning | Sprites evolve independently without breaking others |
| Scaling | Hot sprites scale separately without affecting cold ones |
| Debugging | Isolated failure domains: one sprite’s crash doesn’t take down others |
| Ownership | Clear per-sprite responsibility and monitoring |

Trade-offs:

  • More infrastructure to manage
  • Explicit cross-sprite contracts needed
  • Service discovery overhead

Implications: Per-sprite deployment configs, separate FastAPI routes, per-sprite dashboards, service mesh recommended for production.

DEC-002: Synchronous Chain Execution with Async I/O

Decision: Chains execute step-by-step synchronously. Async I/O is only for external calls. No task queues or fire-and-forget.

Rationale: Gates must be binding — a veto must halt immediately, not after downstream steps fire.

flowchart LR
A["Synchronous Execution"] --> B["Deterministic order"]
A --> C["Binding gates"]
A --> D["Full auditability"]
A --> E["Simpler testing"]
B --> F["No race conditions"]
C --> G["Veto stops immediately"]
D --> H["Linear log history"]
E --> I["Predictable results"]

Trade-offs:

  • Higher latency for long chains
  • No parallelism within a chain
  • Blocking on slow external I/O
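The execution model described for DEC-002 can be sketched in a few lines of Python. This is a minimal illustration, not IRIS code: `Veto`, `ChainResult`, `run_chain`, and the three example steps are hypothetical names. The point it shows is that each step is awaited before the next one starts, so a veto raised by a gate stops the chain before any downstream step runs.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Optional


class Veto(Exception):
    """Raised by a gate step to halt the chain (hypothetical name)."""


@dataclass
class ChainResult:
    completed: list[str] = field(default_factory=list)  # linear log of finished steps
    vetoed_by: Optional[str] = None


# A step is a (name, coroutine function) pair; this shape is an assumption.
Step = tuple[str, Callable[[dict], Awaitable[dict]]]


async def run_chain(steps: list[Step], payload: dict) -> ChainResult:
    result = ChainResult()
    for name, step in steps:
        try:
            # Each step is awaited in order: no gather(), no create_task(), no queue.
            payload = await step(payload)
        except Veto:
            result.vetoed_by = name
            break  # binding gate: nothing after this step runs
        result.completed.append(name)
    return result


async def fetch_context(payload: dict) -> dict:
    await asyncio.sleep(0)  # stands in for an external async I/O call
    return {**payload, "context": "loaded"}


async def policy_gate(payload: dict) -> dict:
    if payload.get("blocked"):
        raise Veto("policy gate rejected the payload")
    return payload


async def publish(payload: dict) -> dict:
    return {**payload, "published": True}


CHAIN: list[Step] = [
    ("fetch_context", fetch_context),
    ("policy_gate", policy_gate),
    ("publish", publish),
]

if __name__ == "__main__":
    print(asyncio.run(run_chain(CHAIN, {"blocked": True})))
```

With `{"blocked": True}` the result records `fetch_context` as completed and `policy_gate` as the vetoing step, and `publish` never executes. The latency trade-off is visible too: a slow `fetch_context` delays every step behind it.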

DEC-003: Declarative YAML + Code Hooks for Rule Enforcement

Decision: Rules in YAML (declarative) with optional code hooks for complex logic.

Rationale: Domain experts must reconfigure without engineering. Pure-code locks out non-engineers; pure-config can’t express complex validation.

flowchart TD
A["Rule Source"] --> B["YAML (declarative)"]
A --> C["Code hooks (imperative)"]
B --> D["Domain experts edit"]
B --> E["JSON Schema validates structure"]
C --> F["Engineers implement complex logic"]
C --> G["Decorator-pattern registration"]
D --> H["FastAPI middleware enforces"]
E --> H
F --> H
G --> H

Trade-offs:

  • Two systems to maintain
  • Hook execution latency
  • Potential YAML/code desync
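As a rough sketch of how declarative rules and code hooks might fit together, the Python below uses a decorator to register hooks and a list of dicts standing in for a parsed YAML rule file. The rule shape (`id`, `field`, `max_length`, `hook`), the `HOOKS` registry, and the `violations` function are all assumptions made for illustration; the actual IRIS rule schema is not shown in this card.

```python
from typing import Callable

# Hypothetical hook registry: hook name -> predicate over a record.
HOOKS: dict[str, Callable[[dict], bool]] = {}


def hook(name: str):
    """Register a function as a named code hook (decorator pattern)."""
    def register(fn: Callable[[dict], bool]) -> Callable[[dict], bool]:
        HOOKS[name] = fn
        return fn
    return register


@hook("title_not_shouting")
def title_not_shouting(record: dict) -> bool:
    # Imperative logic that a declarative max_length rule cannot express.
    return not str(record.get("title", "")).isupper()


# What a parsed YAML rule file might look like after loading (assumed shape).
RULES = [
    {"id": "max-title-length", "field": "title", "max_length": 40},
    {"id": "no-shouting", "hook": "title_not_shouting"},
]


def violations(record: dict, rules: list[dict]) -> list[str]:
    """Return the ids of all rules the record fails, in rule order."""
    failed = []
    for rule in rules:
        if "hook" in rule:
            ok = HOOKS[rule["hook"]](record)  # imperative path
        else:
            ok = len(str(record.get(rule["field"], ""))) <= rule["max_length"]  # declarative path
        if not ok:
            failed.append(rule["id"])
    return failed
```

In this sketch a rule that names an unregistered hook raises `KeyError`, which is one way the YAML/code desync listed in the trade-offs could surface.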

DEC-004: Microservice Repo Architecture

Decision: Each component in its own repo. No monorepo.

Rationale: Monorepos accumulate coupling. Separate repos enforce clean boundaries and independent versioning.

| Pattern | Repos | Boundaries | CI/CD |
| --- | --- | --- | --- |
| Monorepo | 1 | Soft (tends to couple) | Shared, complex |
| Micro-repo | 8 (iris-hq) | Hard (enforced by repo wall) | Independent, simple |

Implications:

  • iris-specs is the contract-driven source of truth
  • Schema changes trigger downstream CI in dependent repos
  • Independent release cadences per component
  • Contract-driven development is mandatory

Trade-offs:

  • Cross-repo coordination required
  • Manual schema propagation
  • Multiple clones for local development

DEC-005: Blake3 Canonical Fingerprinting

Decision: Blake3 with canonical serialization (STRATT-compatible), with SHA-256 as the documented fallback.

Rationale: Blake3 is ~3× faster than SHA-256 while cryptographically strong. STRATT already uses it, enabling cross-system verification.

flowchart LR
A["5-Stage Pipeline"] --> B["1. YAML parse"]
B --> C["2. Object transform<br/>(strip id, fingerprint, $schema)"]
C --> D["3. Canonical JSON<br/>(compact, sorted keys)"]
D --> E["4. UTF-8 encode"]
E --> F["5. Blake3 hash<br/>→ 64-char hex"]

Fingerprint format: blake3:{64-char-hex}

Fallback: SHA-256 via Web Crypto API if blake3-wasm becomes unmaintained.
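Stages 2 to 5 of the pipeline can be approximated in Python as follows. The mentions of blake3-wasm and the Web Crypto API suggest the real implementation is JavaScript, so treat this purely as an illustration. Several details are assumptions: the input is an already-parsed document (stage 1 done elsewhere), only top-level keys are stripped, `json.dumps` with sorted keys and compact separators stands in for the canonical form (whether it matches STRATT byte for byte is not confirmed here), and the `sha256:` prefix used when the third-party `blake3` package is missing is invented for the sketch.

```python
import hashlib
import json

try:
    from blake3 import blake3  # third-party `blake3` package, if installed

    def _digest(data: bytes) -> tuple[str, str]:
        return "blake3", blake3(data).hexdigest()
except ImportError:
    def _digest(data: bytes) -> tuple[str, str]:
        # Stand-in for the documented SHA-256 fallback; the prefix is an assumption.
        return "sha256", hashlib.sha256(data).hexdigest()

# Keys removed before hashing, per the pipeline's "object transform" stage.
STRIPPED_KEYS = {"id", "fingerprint", "$schema"}


def fingerprint(parsed: dict) -> str:
    """Fingerprint an already-parsed document (stage 1, YAML parse, is assumed done)."""
    # Stage 2: object transform (top-level keys only in this sketch).
    obj = {k: v for k, v in parsed.items() if k not in STRIPPED_KEYS}
    # Stage 3: canonical JSON, compact with sorted keys.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    # Stage 4: UTF-8 encode.
    data = canonical.encode("utf-8")
    # Stage 5: hash to a 64-character hex digest.
    algo, hexdigest = _digest(data)
    return f"{algo}:{hexdigest}"
```

Because the stripped keys and key order do not affect the canonical form, `fingerprint({"id": "s1", "name": "iris"})` and `fingerprint({"name": "iris", "$schema": "v1"})` return the same string.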

Decision Status Lifecycle

stateDiagram-v2
[*] --> Proposed: Initial proposal
Proposed --> Accepted: Maintainer review
Accepted --> Locked: Extraordinary justification required
Accepted --> Proposed: New evidence emerges
Locked --> Accepted: Unanimous maintainer consensus
Proposed --> Rejected: Counter-evidence
Accepted --> Rejected: Counter-evidence
Rejected --> [*]

| Status | Meaning | Can Change? |
| --- | --- | --- |
| proposed | Under discussion | Yes, with evidence |
| accepted | Active, standard practice | Yes, with evidence |
| locked | Final, constitutional | Only with extraordinary justification + maintainer consensus |
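The transitions in the state diagram can be encoded as a small lookup table. This is a sketch only: `Status`, `ALLOWED`, and `transition` are hypothetical names, and the code checks which moves are permitted without modeling the evidence or consensus each move requires.

```python
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    ACCEPTED = "accepted"
    LOCKED = "locked"
    REJECTED = "rejected"


# Allowed moves, transcribed from the state diagram; Rejected is terminal.
ALLOWED = {
    Status.PROPOSED: {Status.ACCEPTED, Status.REJECTED},
    Status.ACCEPTED: {Status.LOCKED, Status.PROPOSED, Status.REJECTED},
    Status.LOCKED: {Status.ACCEPTED},
    Status.REJECTED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise ValueError if the diagram has no such edge."""
    if target not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {target.value} is not allowed")
    return target
```

Note that a locked decision cannot be rejected directly in this model: it must first be unlocked back to accepted, which is where the diagram places the unanimous-consensus requirement.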

Key Terms

  • Decision register → Formal record of architectural decisions with status, rationale, pros/cons, and implications
  • Locked decision → A decision that requires extraordinary justification and maintainer consensus to modify
  • Micro-repo architecture → Each service/component in its own repository to enforce clean boundaries
  • Declarative YAML → Human-readable configuration that domain experts can edit without coding
  • Code hooks → Imperative logic attached to declarative rules for complex validation
  • Canonical serialization → Deterministic, unambiguous data representation for cryptographic hashing

Q&A

Q: Can a locked decision ever be changed?
A: Yes, but it requires extraordinary justification and unanimous maintainer consensus. The bar is intentionally high to prevent architectural churn.

Q: Why not use a monorepo with strict boundaries?
A: DEC-004 explicitly rejected this because “soft boundaries tend to couple.” Repository walls are the strongest boundary enforcement mechanism available in Git-based workflows.

Q: What if a domain expert writes invalid YAML?
A: JSON Schema validation catches structural errors before rule activation. The scripts/validate-sprite.py tool and CI gates enforce this.

Q: Why Blake3 and not SHA-3 or SHA-256?
A: Mainly speed. Blake3 achieves ~3 GB/s single-core vs. ~1 GB/s for SHA-256, which matters at scale for an ecosystem that fingerprints every sprite on every update. STRATT also already uses Blake3, which enables cross-system verification.

Q: Where are these decisions documented?
A: iris-specs/.metadata/decision-register.yaml is the authoritative source. Each decision includes origin (e.g., persona-dev), rationale, pros, cons, and implications.

Examples

Architectural decisions are like the rules of chess:

  • DEC-001 = Each piece moves differently (no universal “move any piece anywhere” rule)
  • DEC-002 = Players take turns (no simultaneous moves; you must wait for your opponent)
  • DEC-003 = Written rules plus house rules (the rulebook is fixed, but you can agree on time controls)
  • DEC-004 = Each player brings their own set (no shared pieces that could get mixed up)
  • DEC-005 = Every piece has a unique serial number carved into it (proves authenticity)

Changing a locked decision is like changing chess to allow simultaneous moves — it’s possible, but it breaks every strategy book, every tournament format, and every player’s intuition. You’d need overwhelming evidence that chess would be better that way.
