CRUMB a card from devarno-cloud

OSS & Cloud Modes

kahn intermediate 5 min read

ELI5

Same stove, two kitchens. The home kitchen reads ingredients off your counter and serves you alone; the restaurant kitchen pulls from a shared pantry, takes orders from many tables, and asks for ID at the door. The recipes are identical; only the supply line differs.

Technical Deep Dive

backend/kahn/server.py boots in one of two modes, selected by the --mode flag or the KAHN_MODE environment variable.
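A minimal sketch of how the mode could be resolved. The function name resolve_mode, the precedence (flag over environment variable), and the "oss" default are assumptions made for illustration; the source only states that either input selects the mode.

```python
import argparse
import os

VALID_MODES = ("oss", "cloud")


def resolve_mode(argv=None, environ=None):
    """Return the boot mode from --mode, falling back to KAHN_MODE.

    Assumptions (not confirmed by the source): the flag wins over the
    environment variable, and "oss" is the default when neither is set.
    """
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="kahn.server")
    parser.add_argument("--mode", choices=VALID_MODES, default=None)
    # parse_known_args leaves other flags (e.g. --fixture-dir) untouched.
    args, _unknown = parser.parse_known_args(argv)

    mode = args.mode or env.get("KAHN_MODE") or "oss"
    if mode not in VALID_MODES:
        raise ValueError(f"unknown KAHN_MODE {mode!r}; expected one of {VALID_MODES}")
    return mode
```

Passing argv and environ explicitly keeps the function easy to test; called with no arguments it reads sys.argv and os.environ.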

Mode Matrix

| Concern | OSS (--mode oss) | Cloud (--mode cloud) |
|---|---|---|
| Bind | 127.0.0.1:8080 (compose enforces) | Public, behind airlock |
| Storage | .kahn/archive/ JSON files + in-memory reduce (I-9) | Postgres |
| Auth | LOCALHOST_PRINCIPAL no-op | Airlock OIDC (JWT or introspection) |
| Multi-tenancy | None | Per-tenant rows; RLS via session var |
| Rate-limiting | Off | In-process token bucket per tenant |
| Compose / Manifest | compose.yaml (KAHN_MODE: oss) | railway.toml + deployment/cloud-api/ |
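The "in-process token bucket per tenant" row can be illustrated with a short sketch. This is not the contents of rate_limit.py; the class names, capacity, and refill rate are hypothetical, and the sketch ignores thread safety.

```python
import time


class TokenBucket:
    """Classic token bucket: refills continuously, spends one token per request."""

    def __init__(self, capacity, refill_per_sec, now):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start full
        self.updated_at = now

    def allow(self, now, cost=1.0):
        # Credit tokens for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantRateLimiter:
    """One bucket per tenant, held in process memory (hypothetical defaults)."""

    def __init__(self, capacity=5.0, refill_per_sec=1.0, clock=time.monotonic):
        self._capacity = capacity
        self._refill = refill_per_sec
        self._clock = clock
        self._buckets = {}

    def allow(self, tenant_id):
        now = self._clock()
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, now)
            self._buckets[tenant_id] = bucket
        return bucket.allow(now)
```

Because the buckets live in a plain dict, a second replica would keep its own counts and each tenant's effective limit would double, which is why an in-process limiter only holds up with a single replica.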

Boot Flow

flowchart TD
A[start server.py] --> B{KAHN_MODE / --mode}
B -->|oss| C[localhost bind]
B -->|cloud| D[airlock auth wired]
C --> E[storage = ArchiveFileBackend]
D --> F[storage = PostgresBackend]
E --> G[aggregator + diagnostics]
F --> G
G --> H[FastAPI routes]
H --> I[/ws/live + /api/runs/*/]

Invariants

  • I-8: the storage swap is the only behavioural difference; the aggregator, diagnostics, and server layers don’t fork. The backend is selected by kahn.storage.select_backend.
  • I-9: OSS mode is JSON read + in-memory reduce only — no database. A psycopg import on the OSS path would be a bug.
  • SRS §12: OSS is single-port, single-operator, localhost-only.
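Invariants I-8 and I-9 can be sketched as a single switch returning one of two classes that satisfy the same Protocol. Everything below is illustrative: the real signature of kahn.storage.select_backend is not shown in this card, and the list_runs method, the constructor arguments, and the error messages are assumptions.

```python
import json
from pathlib import Path
from typing import Protocol


class StorageBackend(Protocol):
    """Hypothetical interface; the rest of the app would depend only on this."""

    def list_runs(self) -> list: ...


class ArchiveFileBackend:
    """OSS path: read JSON files and reduce in memory, no database (I-9)."""

    def __init__(self, archive_dir):
        self._dir = Path(archive_dir)

    def list_runs(self):
        runs = []
        for path in sorted(self._dir.glob("*.json")):
            runs.append(json.loads(path.read_text(encoding="utf-8")))
        return runs


class PostgresBackend:
    """Cloud path: placeholder that only records the DSN in this sketch."""

    def __init__(self, dsn):
        self._dsn = dsn

    def list_runs(self):
        # Lazy import keeps psycopg off the OSS import path (I-9).
        import psycopg  # noqa: F401
        raise NotImplementedError("queries are omitted from this sketch")


def select_backend(mode, archive_dir=".kahn/archive", dsn=None):
    """Single switch between the two backends (hypothetical signature)."""
    if mode == "oss":
        return ArchiveFileBackend(archive_dir)
    if mode == "cloud":
        if not dsn:
            raise RuntimeError("cloud mode requires a Postgres DSN")
        return PostgresBackend(dsn)
    raise ValueError(f"unknown mode: {mode!r}")
```

Keeping the psycopg import inside the Postgres class, rather than at module level, is one way to make an accidental database dependency on the OSS path fail loudly in review instead of at runtime.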

Cloud Extras (Not in OSS)

  • cloud_auth.py: JWKS-cached JWT verify + RFC 7662 introspection fallback.
  • api_keys.py: kahn_live_sk_<tenant_prefix>_<secret> ingest credentials.
  • rate_limit.py: in-process token bucket (sound only because Railway runs a single replica).
  • retention.py / retention_jobs.py: soft-delete + hard-delete on the 90-day beta clock.
  • billing.py: source of the plan and retention days.
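The ingest credential shape kahn_live_sk_<tenant_prefix>_<secret> can be minted and parsed as sketched below. This is not the code in api_keys.py: the function names are hypothetical, and the sketch assumes the tenant prefix contains no underscore so the first underscore after the fixed prefix separates the two parts.

```python
import hmac
import secrets
from typing import NamedTuple

KEY_PREFIX = "kahn_live_sk_"


class ParsedKey(NamedTuple):
    tenant_prefix: str
    secret: str


def mint_key(tenant_prefix):
    """Build a key in the documented shape with a random hex secret."""
    if not tenant_prefix or "_" in tenant_prefix:
        raise ValueError("tenant_prefix must be non-empty and contain no underscore")
    return f"{KEY_PREFIX}{tenant_prefix}_{secrets.token_hex(16)}"


def parse_key(raw):
    """Split a presented key into tenant prefix and secret."""
    if not raw.startswith(KEY_PREFIX):
        raise ValueError("not a kahn_live_sk_ key")
    tenant_prefix, sep, secret = raw[len(KEY_PREFIX):].partition("_")
    if not (tenant_prefix and sep and secret):
        raise ValueError("malformed key")
    return ParsedKey(tenant_prefix, secret)


def secrets_match(presented, expected):
    """Constant-time comparison to avoid leaking match length via timing."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Carrying the tenant prefix in the key lets the server find the tenant's stored credential before comparing secrets, instead of scanning every tenant.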

Key Terms

  • select_backend → Single switch that decides between archive-file and Postgres backends; rest of the app sees the same Protocol.
  • KAHN_AGENT_VIEWS=on → Production flag that exposes the agents view.
  • airlock → External OIDC provider at ~/code/workspace/devarno-cloud/airlock/; KAHN delegates cloud-mode auth to it.

Q&A

Q: What happens if I set KAHN_MODE=cloud but no DSN? A: Boot fails: select_backend raises. The cloud path also runs required-table presence checks (e.g. alerts, migration 009). The error points at the right migration file.

Q: Why does OSS bind to 127.0.0.1 in compose, not 0.0.0.0? A: Per SRS §9 the OSS surface is single-operator, single-machine. The compose file enforces it at the host, not inside the container, so a misconfigured container can’t accidentally expose itself.

Q: Can I run cloud mode with the file backend? A: No supported path. Cloud mode assumes Postgres for tenant rows + RLS; the file backend has no concept of multi-tenancy.

Examples

Local dev:

PYTHONPATH=backend python -m kahn.server --fixture-dir data

Cloud deploy: railway.toml provisions the API service; airlock provisions a service:kahn-harness client; tenants.env (untracked) caches the keys for local cloud-shape testing.
