CRUMB a card from devarno-cloud

Gate Engine & Veto Mechanics

iris intermediate 4 min read

ELI5

A gate is like a security checkpoint at an airport. Before you board, TSA checks your ID. If something looks wrong, they stop you immediately — no “let’s keep going and see.” The gate engine is the TSA officer: it evaluates conditions and either waves you through (allow) or stops everything (veto).

Technical Deep Dive

Gate Evaluation Flow

flowchart TD
A["GateEngine.evaluate_gate()"] --> B["_evaluate_condition()"]
B --> C{"Condition type?"}
C -->|None| D["Allow (default)"]
C -->|scope_check| E["Check context.task_scope<br/>against allowed_scopes"]
C -->|safety_review| F["Check context.dangerous_operations"]
C -->|Unknown| G["Allow with warning log"]
E -->|In scope| H["Decision: allow"]
E -->|Out of scope| I["Decision: veto"]
F -->|No dangerous ops| H
F -->|Dangerous ops found| I
G --> H
D --> H
H --> J["Record metrics<br/>counter + histogram"]
I --> J
J --> K["Return GateEvaluation"]
B -->|Exception| L["Fail-open: allow"]
L --> M["Record error metric"]
M --> K

GateEngine Implementation

The GateEngine lives in iris-service/app/engines/gate_engine.py and is wrapped by ChainExecutor.

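import time

# GateEvaluation and record_gate_decision are defined elsewhere in the service.
class GateEngine:
    def evaluate_gate(self, gate_type, sprite_id, condition, context) -> GateEvaluation:
        # Assumption: the metric label is derived from the gate position and condition.
        gate_name = f"{gate_type}:{condition}"
        started = time.perf_counter()
        try:
            decision, reason = self._evaluate_condition(gate_type, condition, context)
            evaluation = GateEvaluation(
                type=gate_type,
                sprite_id=sprite_id,
                decision=decision,  # "allow" or "veto"
                reason=reason,
            )
            duration_ms = (time.perf_counter() - started) * 1000
            record_gate_decision(gate_name, decision, duration_ms)
            return evaluation
        except Exception:
            # Fail-open: allow execution on engine error
            duration_ms = (time.perf_counter() - started) * 1000
            record_gate_decision(gate_name, "error", duration_ms)
            return GateEvaluation(
                type=gate_type,
                sprite_id=sprite_id,
                decision="allow",
                reason="Gate engine error; fail-open",
            )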

Condition Types (Current Implementation)

| Condition | Evaluation Logic | Veto Trigger |
| --- | --- | --- |
| None | Always allows | Never |
| scope_check | context["task_scope"] in context["allowed_scopes"] | Task scope not in allowed list |
| safety_review | context["dangerous_operations"] is empty or false | Dangerous operations detected |
| Unknown | Allows with warning log | Never |
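
The condition rules above can be sketched as a single function. This is a minimal illustration, not the service's actual _evaluate_condition(): the function name evaluate_condition, the use of .get() for missing context keys, and the reason strings for safety_review and unknown conditions are assumptions. The scope_check and None reason strings are taken from the sequence diagram in this card.

```python
import logging
from typing import Any, Dict, Optional, Tuple

logger = logging.getLogger("gate_engine_sketch")


def evaluate_condition(condition: Optional[str], context: Dict[str, Any]) -> Tuple[str, str]:
    """Return (decision, reason) for a single gate condition."""
    if condition is None:
        # No condition configured: allow by default.
        return "allow", "No condition set"

    if condition == "scope_check":
        # Assumption: a missing allowed_scopes key is treated as an empty list.
        allowed = context.get("allowed_scopes", [])
        if context.get("task_scope") in allowed:
            return "allow", "Task scope approved"
        return "veto", "Task scope not authorised"

    if condition == "safety_review":
        # Any truthy dangerous_operations value triggers a veto.
        if context.get("dangerous_operations"):
            return "veto", "Dangerous operations detected"
        return "allow", "No dangerous operations"

    # Unknown condition: allow, but leave a warning in the log.
    logger.warning("Unknown gate condition %r; allowing", condition)
    return "allow", f"Unknown condition: {condition}"
```

Note that an unknown condition string is allowed rather than vetoed, so a typo in a condition name silently disables that gate apart from the warning log.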

Gate Evaluation Sequence

sequenceDiagram
participant Executor as ChainExecutor
participant Engine as GateEngine
participant Metrics as OTel Metrics
Executor->>Engine: evaluate_gate(before, sprite_123, "scope_check", ctx)
Engine->>Engine: _evaluate_condition()
alt Condition: scope_check
Engine->>Engine: Check ctx["task_scope"] in ctx["allowed_scopes"]
alt In scope
Engine-->>Executor: decision: allow, reason: "Task scope approved"
else Out of scope
Engine-->>Executor: decision: veto, reason: "Task scope not authorised"
end
else Condition: None
Engine-->>Executor: decision: allow, reason: "No condition set"
else Exception thrown
Engine-->>Executor: decision: allow, reason: "Gate engine error - fail-open"
end
Engine->>Metrics: record_gate_decision(counter, histogram)
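
From the caller's side, the sequence above amounts to checking a gate before each step and halting on the first veto. The sketch below is a hypothetical stand-in, not the real ChainExecutor: ChainResult, run_chain, scope_gate, and the idea of one before gate per step are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical gate signature: step name -> (decision, reason).
GateFn = Callable[[str], Tuple[str, str]]


@dataclass
class ChainResult:
    status: str = "completed"
    completed_steps: List[str] = field(default_factory=list)
    gates: List[Dict[str, str]] = field(default_factory=list)


def run_chain(steps: List[str], before_gate: GateFn) -> ChainResult:
    """Run steps in order, stopping at the first gate veto."""
    result = ChainResult()
    for step in steps:
        decision, reason = before_gate(step)
        result.gates.append(
            {"type": "before", "step": step, "decision": decision, "reason": reason}
        )
        if decision == "veto":
            # Halt immediately; earlier steps stay recorded as completed.
            result.status = "vetoed"
            break
        result.completed_steps.append(step)
    return result


def scope_gate(step: str) -> Tuple[str, str]:
    """Toy gate that only rejects the deploy step."""
    if step == "deploy":
        return "veto", "Task scope not authorised"
    return "allow", "Task scope approved"


result = run_chain(["lint", "test", "deploy", "notify"], scope_gate)
print(result.status, result.completed_steps)  # vetoed ['lint', 'test']
```

The notify step never runs and has no gate entry, because the loop stops at the deploy veto.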

Metrics Instrumentation

Every gate evaluation emits OpenTelemetry metrics:

| Metric | Type | Labels |
| --- | --- | --- |
| gate_decisions_total | Counter | gate_name, decision (approved/denied/error) |
| gate_evaluation_duration_seconds | Histogram | gate_name, decision |
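
To show how the two instruments relate to their labels, here is a minimal sketch that uses in-memory dictionaries as stand-ins for the OpenTelemetry counter and histogram. It does not call the OpenTelemetry API, and the body of record_gate_decision is an assumption. This version takes the duration in seconds to match the histogram name, whereas the engine listing in this card passes a value named duration_ms.

```python
import time
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Labels = Tuple[str, str]  # (gate_name, decision)

# In-memory stand-ins for the counter and histogram instruments.
gate_decisions_total: DefaultDict[Labels, int] = defaultdict(int)
gate_evaluation_duration_seconds: DefaultDict[Labels, List[float]] = defaultdict(list)


def record_gate_decision(gate_name: str, decision: str, duration_seconds: float) -> None:
    """Increment the decision counter and record one duration sample."""
    labels = (gate_name, decision)
    gate_decisions_total[labels] += 1
    gate_evaluation_duration_seconds[labels].append(duration_seconds)


started = time.perf_counter()
# ... a gate evaluation would run here ...
record_gate_decision("scope_check", "approved", time.perf_counter() - started)
record_gate_decision("scope_check", "denied", 0.002)
record_gate_decision("scope_check", "approved", 0.001)
```

Each distinct (gate_name, decision) pair becomes its own time series, so keeping the set of gate names small keeps metric cardinality bounded.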

Fail-Open Design Rationale

The engine defaults to fail-open (allow on error) rather than fail-closed. This is an intentional design choice for the current placeholder implementation:

  • Fail-open: If the gate engine crashes or encounters an unknown condition, execution continues, so a broken gate cannot halt all operations.
  • Fail-closed (future): Production deployments should configure strict fail-closed behaviour where engine errors result in veto.

⚠️ Security Note: Fail-open is acceptable during development but should be hardened to fail-closed for production deployments handling sensitive operations.
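
One way the two failure modes could be made switchable is a single flag on the error path. This is a sketch under assumptions: evaluate_with_policy, the fail_closed parameter, and the fail-closed reason string are hypothetical and are not part of the current engine.

```python
from typing import Any, Callable, Dict, Tuple

Evaluator = Callable[[Dict[str, Any]], Tuple[str, str]]


def evaluate_with_policy(
    evaluator: Evaluator,
    context: Dict[str, Any],
    fail_closed: bool = False,
) -> Tuple[str, str]:
    """Run an evaluator and apply the configured policy if it raises."""
    try:
        return evaluator(context)
    except Exception as exc:
        if fail_closed:
            # Hardened mode: an engine error halts execution.
            return "veto", f"Gate engine error; fail-closed ({exc})"
        # Current behaviour: an engine error lets execution continue.
        return "allow", "Gate engine error; fail-open"


def broken_evaluator(context: Dict[str, Any]) -> Tuple[str, str]:
    """Simulates a gate whose backing check is unavailable."""
    raise RuntimeError("condition backend unavailable")


open_result = evaluate_with_policy(broken_evaluator, {})
closed_result = evaluate_with_policy(broken_evaluator, {}, fail_closed=True)
print(open_result[0], closed_result[0])  # allow veto
```

Defaulting fail_closed to False mirrors the behaviour described here; a production configuration would flip the default rather than rely on each caller to opt in.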

GateDecision Model

classDiagram
class GateEvaluation {
+string type
+UUID sprite_id
+GateDecision decision
+string reason
}
class GateDecision {
<<enumeration>>
allow
veto
}
GateEvaluation --> GateDecision : uses
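
The class diagram maps naturally onto an enum and a small record type. The sketch below uses the standard library's dataclass and Enum; whether the service uses dataclasses, Pydantic models, or something else is not stated here, so treat the base classes and the upper-case member names as assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from uuid import UUID, uuid4


class GateDecision(str, Enum):
    """The two possible gate outcomes; values match the diagram."""
    ALLOW = "allow"
    VETO = "veto"


@dataclass(frozen=True)
class GateEvaluation:
    type: str            # gate position: "before", "after", or "on_error"
    sprite_id: UUID
    decision: GateDecision
    reason: str


evaluation = GateEvaluation(
    type="before",
    sprite_id=uuid4(),
    decision=GateDecision("veto"),  # raises ValueError for any other string
    reason="Task scope not authorised",
)
```

Using an enum instead of a bare string means a misspelled decision fails loudly at construction time instead of being silently treated as neither allow nor veto.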

Key Terms

  • Gate → A veto-capable checkpoint evaluated during chain execution at before, after, or on_error positions
  • Gate authority → The single sprite per council authorised to make gate decisions
  • Fail-open → Defaulting to allow when the gate engine encounters an error (current behaviour)
  • Fail-closed → Defaulting to veto on engine error (recommended for production)
  • Condition → A string expression evaluated against the execution context (e.g., scope_check, safety_review)
  • Veto message → Human-readable explanation logged when a gate halts execution
  • Gate decision → Either allow (proceed) or veto (halt immediately)

Q&A

Q: Why is the gate engine fail-open?
A: The current implementation is a placeholder. Fail-open prevents a partially implemented gate system from accidentally blocking all chain execution during development. Production hardening should switch to fail-closed.

Q: Can I write custom gate conditions?
A: Not yet via configuration. The engine supports scope_check, safety_review, and None. Custom conditions require modifying GateEngine._evaluate_condition(). The SDK’s ChainExecutor accepts a custom gate_evaluator callable for client-side logic.

Q: What happens to steps already completed when a gate vetoes?
A: They remain in the ChainExecutionResult with status="completed". The overall chain status becomes vetoed, and the gates array includes the veto decision with its reason.

Q: How do I know which gate vetoed?
A: The ChainExecutionResult.gates array contains GateEvaluation objects with sprite_id, type (before/after/on_error), decision (veto), and reason.

Q: Can a gate be bypassed?
A: Only if the gate authority sprite is not protected. Protected sprites (protected: true) cannot be bypassed under any circumstances.
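
As an illustration of the answer about locating the vetoing gate, the sketch below scans a list of gate evaluations for the first veto. The GateEvaluation stand-in is simplified (sprite_id is a plain string here) and find_veto is a hypothetical helper, not an SDK function.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GateEvaluation:
    """Simplified stand-in with the fields named in the Q&A."""
    type: str
    sprite_id: str
    decision: str
    reason: str


def find_veto(gates: List[GateEvaluation]) -> Optional[GateEvaluation]:
    """Return the first vetoing gate, or None if every gate allowed."""
    return next((gate for gate in gates if gate.decision == "veto"), None)


gates = [
    GateEvaluation("before", "sprite_123", "allow", "Task scope approved"),
    GateEvaluation("after", "sprite_123", "veto", "Dangerous operations detected"),
]

veto = find_veto(gates)
if veto is not None:
    print(f"{veto.type} gate vetoed by {veto.sprite_id}: {veto.reason}")
```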

Examples

A gate is like a building’s fire suppression system:

  • Smoke detector (before gate) = Checks air quality before anyone enters the lab. If smoke detected → lockdown.
  • Sprinkler system (after gate) = After an experiment finishes, checks temperature. If too hot → flood the room to prevent fire spread.
  • Emergency exit (on_error gate) = If an experiment explodes, the on_error gate checks if evacuation is needed.
  • Fail-open = If the smoke detector’s battery dies, the door stays unlocked (people can still work — risk accepted during construction)
  • Fail-closed = If the smoke detector fails, the door locks automatically (nobody enters until it’s fixed — safer for occupied buildings)
