Synchronous Chain Execution Engine (DEC-002)


ELI5

A chain execution is like a relay race where each runner must finish their lap before the next runner starts. There’s a referee at the start line who can cancel the whole race if conditions are bad (before gate), and another referee at the finish line who can disqualify the team if they cheated (after gate). If a runner trips and falls, a medic checks them (on_error gate) before deciding whether the race continues.

Technical Deep Dive

DEC-002: Synchronous Chain Execution with Async I/O

The decision: Chains execute step-by-step synchronously; async I/O is used only for external calls. No async task queues, no fire-and-forget, no parallelism within a chain.

Rationale: Gates must be binding. A veto must halt execution immediately, not after downstream steps have already fired. Councils deliberate; they don’t batch-process. Async execution would also break auditability, which depends on a linear, deterministic execution log.
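
To make the decision concrete, here is a minimal asyncio sketch. The names `call_sprite`, `gate_allows`, and `run_chain` are hypothetical and not part of iris. Each external call is awaited inline, and nothing is scheduled with `asyncio.create_task` or `asyncio.gather`, so a veto stops the loop before any later step has started.

```python
import asyncio


# Hypothetical stand-ins for illustration only; not the iris API.
async def call_sprite(action: str) -> dict:
    await asyncio.sleep(0)  # stands in for async I/O to an external sprite
    return {"action": action, "status": "simulated"}


def gate_allows(output: dict) -> bool:
    # Toy gate: veto any step whose action is "publish".
    return output["action"] != "publish"


async def run_chain(actions: list[str]) -> tuple[str, list[str]]:
    started: list[str] = []
    for action in actions:
        started.append(action)
        output = await call_sprite(action)  # awaited inline, never queued
        if not gate_allows(output):
            return "vetoed", started  # later steps never start
    return "completed", started


status, started = asyncio.run(run_chain(["draft", "publish", "notify"]))
print(status, started)  # vetoed ['draft', 'publish']
```

The `notify` step never appears in `started`, which is the property the rationale asks for: the veto takes effect before any downstream step fires.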

Execution Flow

flowchart TD
    A["ChainExecutor.execute_chain()"] --> B["Parse & validate chain"]
    B --> C["Evaluate BEFORE gates"]
    C -->|Any veto| D["Return: status=vetoed"]
    C -->|All allow| E["Step 0: invoke sprite"]
    E -->|Success| F["Evaluate AFTER gate (step 0)"]
    E -->|Failure| G["Evaluate ON_ERROR gate"]
    F -->|Veto| D
    F -->|Allow| H["Step 1: invoke sprite"]
    G -->|Veto| D
    G -->|Allow| H
    H -->|Success| I["Step 2..."]
    H -->|Failure| J["Evaluate ON_ERROR gate"]
    I -->|All steps complete| K["Evaluate final AFTER gates"]
    K -->|Veto| D
    K -->|Allow| L["Return: status=completed"]
    J -->|Allow| I
    J -->|Veto| D
    D --> M["Persist execution history"]
    L --> M
    M --> N["Return ChainExecutionResult"]
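
The flowchart can be approximated in plain Python as follows. This is a sketch under stated assumptions, not the ChainExecutor implementation: gates are modelled as callables that return True to allow, steps as callables that raise on failure, and the result is a plain dict. History persistence is left out, and a failed step with no on_error gates configured simply continues, which the flowchart does not specify.

```python
from typing import Callable, Sequence

Gate = Callable[[dict], bool]   # True = allow, False = veto (assumed shape)
Step = Callable[[dict], dict]   # raises to signal a failed step (assumed shape)


def execute_chain(
    steps: Sequence[Step],
    payload: dict,
    before: Sequence[Gate] = (),
    after: Sequence[Gate] = (),
    on_error: Sequence[Gate] = (),
    final: Sequence[Gate] = (),
) -> dict:
    step_log: list[tuple] = []
    gate_log: list[tuple] = []

    def all_allow(kind: str, gates: Sequence[Gate], ctx: dict) -> bool:
        # Evaluate gates in order; the first veto wins.
        for gate in gates:
            allowed = gate(ctx)
            gate_log.append((kind, "allow" if allowed else "veto"))
            if not allowed:
                return False
        return True

    def finish(status: str) -> dict:
        return {"status": status, "steps": step_log, "gates": gate_log}

    if not all_allow("before", before, payload):
        return finish("vetoed")

    for order, step in enumerate(steps):
        try:
            output = step(payload)
        except Exception as exc:
            step_log.append((order, "failed", str(exc)))
            if not all_allow("on_error", on_error, {"error": str(exc)}):
                return finish("vetoed")
            continue  # on_error gate allowed: move on to the next step
        step_log.append((order, "completed", output))
        if not all_allow("after", after, output):
            return finish("vetoed")

    if not all_allow("final", final, payload):
        return finish("vetoed")
    return finish("completed")


def draft(ctx: dict) -> dict:
    return {"text": "hello"}


def boom(ctx: dict) -> dict:
    raise RuntimeError("sprite unavailable")


result = execute_chain([draft, boom, draft], {"topic": "x"}, on_error=[lambda ctx: True])
print(result["status"], [s[1] for s in result["steps"]])
# completed ['completed', 'failed', 'completed']
```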

Three-Phase Protocol

sequenceDiagram
    autonumber
    participant C as Client
    participant R as Chain Router
    participant E as ChainExecutor
    participant G as GateEngine
    participant S as Sprite (placeholder)

    C->>R: POST /v1/chains/execute
    R->>E: execute_chain(council_id, chain, input)

    rect rgb(255, 240, 245)
        Note over E,G: Phase 1: Pre-execution gates
        E->>G: evaluate before-gates
        G-->>E: allow / veto
    end

    alt Before-gate veto
        E-->>R: status: vetoed
        R-->>C: 409 GATE_VETO
    else Before-gates allow
        rect rgb(240, 255, 240)
            Note over E,S: Phase 2: Step execution (sequential)
            loop For each step
                E->>S: invoke(action, input_map)
                S-->>E: output
                E->>G: evaluate after-gate
                G-->>E: allow / veto
                alt After-gate veto
                    E-->>R: status: vetoed
                    R-->>C: 409 GATE_VETO
                end
            end
        end

        rect rgb(240, 245, 255)
            Note over E,G: Phase 3: Post-execution gates
            E->>G: evaluate final after-gates
            G-->>E: allow / veto
        end

        alt Final gate veto
            E-->>R: status: vetoed
            R-->>C: 409 GATE_VETO
        else All gates allow
            E-->>R: status: completed
            R-->>C: 200 ChainExecutionResult
        end
    end
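
The router's status-to-response mapping shown in the diagram could look roughly like this. The helper name and the body shapes are assumptions; only the 409 GATE_VETO and 200 outcomes come from the diagram, which does not show how a failed chain is reported.

```python
def to_http_response(status: str) -> tuple[int, dict]:
    """Map a chain status to (HTTP status code, body). Hypothetical helper."""
    if status == "vetoed":
        return 409, {"code": "GATE_VETO"}
    if status == "completed":
        # In the diagram the 200 body is the full ChainExecutionResult;
        # a one-key dict stands in for it here.
        return 200, {"status": "completed"}
    # The diagram does not show a mapping for other statuses (e.g. "failed").
    raise ValueError(f"no mapping shown for status {status!r}")


print(to_http_response("vetoed"))  # (409, {'code': 'GATE_VETO'})
```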

Execution Result Model

classDiagram
    class ChainExecutionResult {
        +UUID execution_id
        +UUID council_id
        +UUID chain_id
        +ChainStatus status
        +datetime started_at
        +datetime completed_at
        +integer duration_ms
        +StepExecution[] steps
        +GateEvaluation[] gates
    }
    class StepExecution {
        +integer order
        +UUID sprite_id
        +string action
        +StepStatus status
        +object output
        +string error
    }
    class GateEvaluation {
        +string type
        +UUID sprite_id
        +GateDecision decision
        +string reason
    }
    class ChainStatus {
        <<enumeration>>
        completed
        failed
        vetoed
    }
    ChainExecutionResult --> StepExecution : contains
    ChainExecutionResult --> GateEvaluation : contains
    ChainExecutionResult --> ChainStatus : uses
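
One way to express this model in Python is with dataclasses. This is a sketch rather than the service's actual schema. The diagram does not list the StepStatus or GateDecision values, so they are plain strings here, and `duration_ms` is derived from the two timestamps instead of being stored, which is a design choice made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Any, Optional
from uuid import UUID, uuid4


class ChainStatus(str, Enum):
    COMPLETED = "completed"
    FAILED = "failed"
    VETOED = "vetoed"


@dataclass
class StepExecution:
    order: int
    sprite_id: UUID
    action: str
    status: str                  # StepStatus values are not listed in the diagram
    output: Any = None
    error: Optional[str] = None


@dataclass
class GateEvaluation:
    type: str                    # e.g. "before", "after", "on_error" (assumed spelling)
    sprite_id: UUID
    decision: str                # GateDecision values are not listed in the diagram
    reason: str = ""


@dataclass
class ChainExecutionResult:
    execution_id: UUID
    council_id: UUID
    chain_id: UUID
    status: ChainStatus
    started_at: datetime
    completed_at: datetime
    steps: list[StepExecution] = field(default_factory=list)
    gates: list[GateEvaluation] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        # Integer milliseconds between start and completion.
        return (self.completed_at - self.started_at) // timedelta(milliseconds=1)


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
result = ChainExecutionResult(
    execution_id=uuid4(), council_id=uuid4(), chain_id=uuid4(),
    status=ChainStatus.VETOED,
    started_at=start, completed_at=start + timedelta(milliseconds=1500),
    steps=[StepExecution(0, uuid4(), "draft", "completed", {"status": "simulated"})],
    gates=[GateEvaluation("after", uuid4(), "veto", "policy check failed")],
)
print(result.status.value, result.duration_ms)  # vetoed 1500
```

The example result is vetoed yet still carries its completed step, so the record shows exactly what ran before the gate stopped the chain.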

CouncilExecutor Behaviour (SDK)

The Python SDK’s CouncilExecutor adds a layer on top of single-chain execution: it runs every chain in a council, in order.

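import logging

log = logging.getLogger(__name__)  # assumed: the SDK's log is a standard logger

class CouncilExecutor:
    # Excerpt: self.council and self.execute_chain are defined elsewhere in the SDK.
    def execute_all_chains(self):
        for chain in self.council.chains:
            result = self.execute_chain(chain.name)
            if result.status == "vetoed":
                break  # A veto halts the entire council
            if result.status == "failed":
                # A failed chain is logged; the loop moves on to the next chain
                log.error("Chain %s failed", chain.name)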
Key difference: A vetoed chain halts the entire council execution. A failed chain logs an error but the council continues with remaining chains.
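
The difference is easy to see with scripted results. The sketch below is a hypothetical free-function version of the loop above (`run_council` and the chain names are made up for illustration), driven by a stub instead of a real executor.

```python
from types import SimpleNamespace


def run_council(chain_names, execute_chain):
    """Hypothetical stand-in for CouncilExecutor.execute_all_chains."""
    ran = []
    for name in chain_names:
        result = execute_chain(name)
        ran.append((name, result.status))
        if result.status == "vetoed":
            break  # veto: stop the whole council
        # "failed" falls through: the next chain still runs
    return ran


# Scripted statuses stand in for real chain executions.
scripted = {
    "intake": "completed",
    "review": "failed",
    "publish": "vetoed",
    "archive": "completed",
}
ran = run_council(list(scripted), lambda name: SimpleNamespace(status=scripted[name]))
print(ran)
# [('intake', 'completed'), ('review', 'failed'), ('publish', 'vetoed')]
```

The failed `review` chain does not stop `publish` from running, but the veto on `publish` means `archive` is never attempted.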

State Machine

stateDiagram-v2
    [*] --> Validating: Parse chain
    Validating --> BeforeGates: Structure valid
    BeforeGates --> Executing: All gates allow
    BeforeGates --> Vetoed: Gate vetoes
    Executing --> AfterGate: Step succeeds
    Executing --> OnErrorGate: Step fails
    AfterGate --> NextStep: Gate allows
    AfterGate --> Vetoed: Gate vetoes
    OnErrorGate --> NextStep: Gate allows (continue)
    OnErrorGate --> Vetoed: Gate vetoes
    NextStep --> Executing: More steps
    NextStep --> FinalGates: All steps done
    FinalGates --> Completed: Gates allow
    FinalGates --> Vetoed: Gate vetoes
    Completed --> [*]
    Vetoed --> [*]
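
The same state machine can be written as a transition table, which makes it straightforward to check that a sequence of events is legal. The state names come from the diagram; the event names and the `walk` helper are assumptions made for this sketch.

```python
# (state, event) -> next state, transcribed from the diagram.
TRANSITIONS = {
    ("Validating", "structure_valid"): "BeforeGates",
    ("BeforeGates", "allow"): "Executing",
    ("BeforeGates", "veto"): "Vetoed",
    ("Executing", "step_succeeds"): "AfterGate",
    ("Executing", "step_fails"): "OnErrorGate",
    ("AfterGate", "allow"): "NextStep",
    ("AfterGate", "veto"): "Vetoed",
    ("OnErrorGate", "allow"): "NextStep",
    ("OnErrorGate", "veto"): "Vetoed",
    ("NextStep", "more_steps"): "Executing",
    ("NextStep", "all_steps_done"): "FinalGates",
    ("FinalGates", "allow"): "Completed",
    ("FinalGates", "veto"): "Vetoed",
}
TERMINAL = {"Completed", "Vetoed"}


def walk(events, state="Validating"):
    """Apply events in order and return the final state."""
    for event in events:
        if state in TERMINAL:
            raise ValueError(f"{state} is terminal; cannot apply {event!r}")
        try:
            state = TRANSITIONS[(state, event)]
        except KeyError:
            raise ValueError(f"illegal event {event!r} in state {state}") from None
    return state


happy_path = ["structure_valid", "allow", "step_succeeds", "allow", "all_steps_done", "allow"]
print(walk(happy_path))  # Completed
print(walk(["structure_valid", "allow", "step_fails", "veto"]))  # Vetoed
```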

Key Terms

Synchronous execution → Steps run one at a time, in order, with no parallelism. Each step must complete before the next begins.
Binding gate → A gate whose veto immediately halts all further execution. Completed steps are not rolled back, but no new steps start.
Step execution → A single invocation of a sprite’s capability within a chain
ChainExecutionResult → The complete audit record of a chain run: status, timing, per-step outputs, and gate decisions
Veto → A gate decision that halts the chain immediately. The API returns HTTP 409 with the GATE_VETO code.
Placeholder invocation → Sprite invocation is currently simulated and returns a placeholder ({"status": "simulated"}). A real implementation would use RPC or REST calls to sprite endpoints.

Q&A

Q: Why not make chains async for better performance? A: DEC-002 explicitly rejected async execution because:

Gates must be truly binding — a veto must stop execution before downstream steps fire
Async task queues introduce race conditions where a step might complete after a gate veto
Auditability requires a linear, deterministic execution log (not a DAG)
“Councils deliberate, they don’t batch-process”

Q: What happens to completed steps when a later gate vetoes? A: They remain in the result with status="completed". The chain returns status="vetoed" and includes all gate evaluations. There is no automatic rollback — side effects from completed steps may need manual cleanup.

Q: Can steps run in parallel within a chain? A: No. DEC-002 mandates sequential execution. If you need parallel execution, model it as separate chains within the same council or use external orchestration.

Q: How is execution history persisted? A: The ExecutionHistoryRegistry stores every ChainExecutionResult keyed by execution_id and indexed by chain_id. The GET /v1/chains/{id}/history endpoint provides paginated access with optional status filtering.
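
A registry with that shape could be sketched in memory as below. This is not the ExecutionHistoryRegistry itself: the class name, the trimmed `Record` type, and the `offset`/`limit` pagination parameters are assumptions, since the source only says access is paginated.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical, trimmed stand-in for ChainExecutionResult."""
    execution_id: str
    chain_id: str
    status: str


class InMemoryHistory:
    def __init__(self) -> None:
        self._by_execution: dict[str, Record] = {}                     # primary key
        self._by_chain: dict[str, list[str]] = defaultdict(list)       # secondary index

    def add(self, record: Record) -> None:
        self._by_execution[record.execution_id] = record
        self._by_chain[record.chain_id].append(record.execution_id)

    def get(self, execution_id: str) -> Optional[Record]:
        return self._by_execution.get(execution_id)

    def history(self, chain_id: str, status: Optional[str] = None,
                offset: int = 0, limit: int = 20) -> list[Record]:
        # Look up ids via the chain index, filter, then slice one page.
        ids = self._by_chain.get(chain_id, [])
        records = [self._by_execution[i] for i in ids]
        if status is not None:
            records = [r for r in records if r.status == status]
        return records[offset:offset + limit]


history = InMemoryHistory()
history.add(Record("e1", "c1", "completed"))
history.add(Record("e2", "c1", "vetoed"))
history.add(Record("e3", "c2", "completed"))
print([r.execution_id for r in history.history("c1", status="vetoed")])  # ['e2']
```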

Q: What is the maximum chain execution time? A: The iris-service config sets MAX_CHAIN_EXECUTION_TIME_SECONDS=300 (5 minutes). The _parse_timeout() method converts ISO 8601 duration strings (e.g., PT5M) to seconds. Full timeout enforcement is planned but not yet implemented.
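
For illustration, converting an ISO 8601 duration to seconds can be sketched as below. This is not the actual `_parse_timeout()` code: `parse_timeout` is a hypothetical helper that handles only whole-number hour, minute, and second components (no days, weeks, or fractions).

```python
import re

# Matches time-only durations such as PT5M, PT90S, or PT1H30M.
_DURATION = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def parse_timeout(value: str) -> int:
    """Convert a time-only ISO 8601 duration string to seconds."""
    match = _DURATION.fullmatch(value)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


print(parse_timeout("PT5M"))  # 300
```

`PT5M` parses to 300 seconds, which matches the configured maximum.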

Examples

Synchronous chain execution is like an airport security checkpoint:

Before gate = TSA checks your ID and boarding pass before you enter the queue
Step 0 = Remove shoes, belt, electronics → place in bins
After gate (step 0) = X-ray operator checks the scan. If something looks suspicious → veto (full pat-down, no one else advances)
Step 1 = Walk through metal detector
After gate (step 1) = If detector beeps → veto (wanded inspection)
Step 2 = Collect belongings from conveyor
Final gate = Gate agent verifies your face matches your ID before you board

If any checkpoint says “stop,” the entire process halts immediately. You don’t proceed to the metal detector while TSA is still examining your bag.
