OSS & Cloud Modes
kahn intermediate 5 min read
ELI5
Same stove, two kitchens. The home kitchen reads ingredients off your counter and serves you alone; the restaurant kitchen pulls from a shared pantry, takes orders from many tables, and asks for ID at the door. The recipes are identical; only the supply line differs.
Technical Deep Dive
backend/kahn/server.py boots in one of two modes, selected by --mode flag or KAHN_MODE env var.
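The selection step can be sketched as follows. This is a minimal illustration, not the code in server.py: `resolve_mode` is a hypothetical helper, and two behaviours are assumptions the source does not confirm, namely that the flag takes precedence over the env var and that `oss` is the default when neither is set.

```python
import argparse
import os

VALID_MODES = ("oss", "cloud")


def resolve_mode(argv, environ=None):
    """Return the boot mode from --mode, falling back to KAHN_MODE.

    Assumptions (not confirmed by the source): the flag wins over the
    env var, and "oss" is the default when neither is set.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="kahn.server")
    parser.add_argument("--mode", choices=VALID_MODES, default=None)
    # parse_known_args leaves other server flags (e.g. --fixture-dir) alone.
    args, _unknown = parser.parse_known_args(argv)

    mode = args.mode or environ.get("KAHN_MODE") or "oss"
    if mode not in VALID_MODES:
        raise ValueError(f"unknown KAHN_MODE {mode!r}; expected one of {VALID_MODES}")
    return mode
```

Passing `environ` explicitly keeps the helper easy to test without mutating the real process environment.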
Mode Matrix
| Concern | OSS (--mode oss) | Cloud (--mode cloud) |
|---|---|---|
| Bind | 127.0.0.1:8080 (compose enforces) | Public, behind airlock |
| Storage | .kahn/archive/ JSON files + in-memory reduce (I-9) | Postgres |
| Auth | LOCALHOST_PRINCIPAL no-op | Airlock OIDC (JWT or introspection) |
| Multi-tenancy | None | Per-tenant rows; RLS via session var |
| Rate-limiting | Off | In-process token bucket per tenant |
| Compose / Manifest | compose.yaml (KAHN_MODE: oss) | railway.toml + deployment/cloud-api/ |
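To illustrate what "RLS via session var" can look like in Postgres, the sketch below pairs a row-level security policy with a transaction-scoped `set_config` call. The table name `runs`, the column `tenant_id`, the setting name `app.tenant_id`, and the helper `tenant_scope_statement` are all hypothetical; the source does not name them.

```python
# Hypothetical DDL: table, column and setting names are illustrative only.
RLS_POLICY_SQL = """
ALTER TABLE runs ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON runs
    USING (tenant_id = current_setting('app.tenant_id'));
"""


def tenant_scope_statement(tenant_id: str):
    """Return (sql, params) that pins the session variable for one transaction.

    set_config(name, value, is_local) with is_local = true scopes the value
    to the current transaction, so it does not carry over to the next
    request served on a pooled connection.
    """
    if not tenant_id:
        raise ValueError("tenant_id must be non-empty")
    return "SELECT set_config('app.tenant_id', %s, true)", (tenant_id,)
```

The statement would be executed at the start of each request's transaction, before any tenant-scoped query runs.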
Boot Flow
flowchart TD
    A[start server.py] --> B{KAHN_MODE / --mode}
    B -->|oss| C[localhost bind]
    B -->|cloud| D[airlock auth wired]
    C --> E[storage = ArchiveFileBackend]
    D --> F[storage = PostgresBackend]
    E --> G[aggregator + diagnostics]
    F --> G
    G --> H[FastAPI routes]
    H --> I[/ws/live + /api/runs/*/]
Invariants
- I-8: the storage swap is the only behaviour difference; the aggregator, diagnostics, and server layers don’t fork. Selected by kahn.storage.select_backend.
- I-9: OSS mode is JSON read + in-memory reduce only, with no database. A psycopg import on the OSS path would be a bug.
- SRS §12: OSS is single-port, single-operator, localhost-only.
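I-8 and I-9 together suggest a shape like the one below: one Protocol, two implementations, and a database driver import that only runs on the cloud branch. This is a sketch under stated assumptions, not the actual kahn.storage module. The `select_backend` signature, the `list_runs` method, the one-JSON-file-per-run layout, and the `runs` table are all hypothetical.

```python
import json
from pathlib import Path
from typing import List, Optional, Protocol


class StorageBackend(Protocol):
    """What the aggregator and routes depend on (hypothetical method)."""

    def list_runs(self) -> List[dict]: ...


class ArchiveFileBackend:
    """OSS path: read JSON files and reduce in memory (I-9)."""

    def __init__(self, archive_dir: str = ".kahn/archive") -> None:
        self.archive_dir = Path(archive_dir)

    def list_runs(self) -> List[dict]:
        # Assumed layout: one JSON document per run.
        if not self.archive_dir.is_dir():
            return []
        return [json.loads(p.read_text()) for p in sorted(self.archive_dir.glob("*.json"))]


class PostgresBackend:
    """Cloud path: the driver is imported lazily so OSS never loads it."""

    def __init__(self, dsn: str) -> None:
        import psycopg  # deliberately not a module-level import (I-9)

        self._conn = psycopg.connect(dsn)

    def list_runs(self) -> List[dict]:
        with self._conn.cursor() as cur:
            cur.execute("SELECT id FROM runs")  # hypothetical table
            return [{"id": row[0]} for row in cur.fetchall()]


def select_backend(mode: str, dsn: Optional[str] = None,
                   archive_dir: str = ".kahn/archive") -> StorageBackend:
    """Single switch between backends; callers only see StorageBackend."""
    if mode == "oss":
        return ArchiveFileBackend(archive_dir)
    if mode == "cloud":
        if not dsn:
            raise RuntimeError("cloud mode requires a Postgres DSN")
        return PostgresBackend(dsn)
    raise ValueError(f"unknown mode {mode!r}")
```

Because the `import psycopg` statement lives inside `PostgresBackend.__init__`, selecting `oss` never executes it, which is one way to keep the I-9 guarantee checkable.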
Cloud Extras (Not in OSS)
- cloud_auth.py: JWKS-cached JWT verify + RFC 7662 introspection fallback.
- api_keys.py: kahn_live_sk_<tenant_prefix>_<secret> ingest credentials.
- rate_limit.py: in-process bucket (sound only because Railway runs a single replica).
- retention.py / retention_jobs.py: soft-delete + hard-delete on the 90-day beta clock.
- billing.py: plan + retention day source.
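The single-replica caveat on rate_limit.py is easier to see with a concrete bucket. The sketch below is a generic per-tenant token bucket, not the code in rate_limit.py; the class name `TenantRateLimiter`, its parameters, and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TenantRateLimiter:
    """In-process token bucket keyed by tenant id (illustrative only)."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        # tenant_id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, tenant_id: str, cost: float = 1.0) -> bool:
        now = self._clock()
        # A tenant seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(tenant_id, (self.capacity, now))
        # Refill for the elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self._buckets[tenant_id] = (tokens, now)
        return allowed
```

All state lives in one process's memory, so with N replicas each tenant would effectively get N separate budgets. That is why an in-process bucket is only sound while the service runs as a single replica. The sketch is also not thread-safe; a lock would be needed if `allow` were called from multiple threads.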
Key Terms
- select_backend → Single switch that decides between archive-file and Postgres backends; rest of the app sees the same Protocol.
- KAHN_AGENT_VIEWS=on → Production flag that exposes the agents view.
- airlock → External OIDC provider at ~/code/workspace/devarno-cloud/airlock/; KAHN delegates cloud-mode auth to it.
Q&A
Q: What happens if I set KAHN_MODE=cloud but no DSN?
A: Boot fails. select_backend raises, and the cloud path also checks that required tables are present (e.g. alerts, migration 009). The error points at the right migration file.
Q: Why does OSS bind to 127.0.0.1 in compose, not 0.0.0.0?
A: Per SRS §9 the OSS surface is single-operator, single-machine. The compose file enforces it at the host, not inside the container, so a misconfigured container can’t accidentally expose itself.
Q: Can I run cloud mode with the file backend?
A: No supported path. Cloud mode assumes Postgres for tenant rows + RLS; the file backend has no concept of multi-tenancy.
Examples
Local dev:
PYTHONPATH=backend python -m kahn.server --fixture-dir data
Cloud deploy: railway.toml provisions the API service; airlock provisions a service:kahn-harness client; tenants.env (untracked) caches the keys for local cloud-shape testing.
neighbors on the map
- Horizontal Scalability Seams planning an S3-backed archive
- Retention Soft & Hard Delete auditing why a run vanished