Two-Service MCP & Engine Architecture
traceo beginner 4 min read
ELI5
Traceo is a restaurant with two counters. The MCP counter (:8000) takes orders for requirements CRUD and traceability from Claude and the web UI. The Engine counter (:8001) is the kitchen — it ingests CSV/Excel files asynchronously and reports back through /jobs/{id}.
Technical Deep Dive
Service Split
| Service | Path | Port | Entry | Purpose |
|---|---|---|---|---|
| MCP Server | traceo_mcp_server/ | 8000 | app.py (FastAPI) + server.py (FastMCP) | Requirements CRUD, traceability, Ariel baselines, audit |
| Engine | engine/src/engine/ | 8001 | server.py | Async ingestion of CSV/Excel into requirements |
Both are Python ASGI apps. The MCP server speaks the MCP protocol over stdio (for Claude clients) and HTTP (for the Next.js client at web/apps/client). The engine speaks HTTP only.
Component Topology
flowchart LR
  subgraph clients[Clients]
    CLI[Claude MCP via stdio]
    WEB[Next.js client]
  end
  subgraph mcp[MCP Server :8000]
    APP[app.py FastAPI]
    SRV[server.py FastMCP]
    SVC[Services: requirements, traceability, RAG, audit, ariel]
  end
  subgraph eng[Engine :8001]
    ESRV[server.py]
    JM[JobManager]
    RUST[Optional Rust backend]
  end
  PG[(PostgreSQL + pgvector)]
  CLI --> SRV
  WEB --> APP
  APP --> SVC
  SRV --> SVC
  SVC --> PG
  WEB --> ESRV
  ESRV --> JM
  JM --> RUST
  ESRV -->|webhook| APP
Middleware Stacks
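Both apps register middleware bottom-up: the last middleware added wraps all the others, so the runtime execution order is the reverse of the registration order. The resulting runtime nesting is: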
| Order | MCP :8000 | Engine :8001 |
|---|---|---|
| outermost | CORSMiddleware | CORSMiddleware |
| middle | RateLimitMiddleware (100/min) | AuthMiddleware (JWT) |
| innermost | (auth applied per-route via decorator) | RateLimitMiddleware (50/min) |
The engine enforces JWT centrally because every engine endpoint except /health, /ready, /metrics, and /docs requires an authenticated user. The MCP server applies @require_permission(...) at the route/tool level instead, because some tools are read-only and the auth surface differs between HTTP routes and MCP tools.
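The ordering rule is easy to get backwards, so here is a minimal standard-library sketch of it. It does not use FastAPI or Starlette; `make_middleware`, `build_stack`, and `run` are hypothetical helpers that only model the "last registered is outermost" behavior, and the registration list is an assumption inferred from the engine column of the table.

```python
from typing import Callable, List

# A handler receives the request path and a trace list it appends to.
Handler = Callable[[str, List[str]], str]


def make_middleware(name: str, inner: Handler) -> Handler:
    """Wrap `inner` in a layer that records its name before delegating."""
    def handler(path: str, trace: List[str]) -> str:
        trace.append(name)
        return inner(path, trace)
    return handler


def build_stack(registered: List[str], endpoint: Handler) -> Handler:
    """Wrap the endpoint so the LAST registered middleware ends up outermost."""
    app = endpoint
    for name in registered:
        app = make_middleware(name, app)
    return app


def endpoint(path: str, trace: List[str]) -> str:
    trace.append("endpoint")
    return "ok"


def run(registered: List[str], path: str) -> List[str]:
    """Send one request through the stack and return the execution order."""
    trace: List[str] = []
    build_stack(registered, endpoint)(path, trace)
    return trace


# Assumed engine registration order (bottom-up): innermost first.
ENGINE_REGISTERED = ["RateLimitMiddleware", "AuthMiddleware", "CORSMiddleware"]

engine_order = run(ENGINE_REGISTERED, "/jobs/ingest")
print(engine_order)
```

Reading the printed list from left to right gives the outermost-to-innermost order shown in the engine column of the table.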
Webhook Direction
The engine notifies the MCP server when a job completes via POST /webhooks/... on the MCP side (routes/webhooks.py). This is the only inter-service call in the system; every other request originates from a client that calls each service directly.
Key Terms
- MCP Server → FastAPI + FastMCP host on :8000 exposing both HTTP routes and MCP tools.
- Engine → FastAPI app on :8001 whose only job is async file ingestion via JobManager.
- Webhook → Engine-to-MCP callback used to signal job completion.
Q&A
Q: Why is the rate limit lower on the engine (50/min) than on the MCP server (100/min)?
A: Engine endpoints kick off async jobs that consume CPU and DB writes, so each request is far more expensive than an MCP read. The lower budget caps the queue depth a single IP can create.
Q: What gets bypassed for /health on the MCP server?
A: /health is exempt from the rate limiter (it would otherwise pollute the bucket and 429 a load balancer probe), but it is not exempt from CORS.
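To make the per-IP budget and the /health exemption concrete, here is a minimal sketch of a fixed-window limiter. The source does not say which algorithm RateLimitMiddleware uses, so the fixed window, the `FixedWindowLimiter` class, and the `/some/route` path are all assumptions for illustration only.

```python
import time
from typing import Callable, Dict, FrozenSet, Tuple


class FixedWindowLimiter:
    """Hypothetical per-IP limiter: at most `limit` requests per window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float = 60.0,
        exempt_paths: FrozenSet[str] = frozenset({"/health"}),
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self.exempt_paths = exempt_paths
        self.clock = clock
        # client_ip -> (window_start, requests_counted_in_window)
        self._buckets: Dict[str, Tuple[float, int]] = {}

    def allow(self, client_ip: str, path: str) -> bool:
        # Exempt paths never touch the bucket, so probes cannot exhaust it.
        if path in self.exempt_paths:
            return True
        now = self.clock()
        start, count = self._buckets.get(client_ip, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window elapsed: start a fresh one
        if count >= self.limit:
            self._buckets[client_ip] = (start, count)
            return False  # the caller would answer 429 here
        self._buckets[client_ip] = (start, count + 1)
        return True


# Controllable clock so the example is deterministic.
now = [0.0]
mcp_limiter = FixedWindowLimiter(limit=100, clock=lambda: now[0])

results = [mcp_limiter.allow("10.0.0.1", "/some/route") for _ in range(101)]
health_ok = mcp_limiter.allow("10.0.0.1", "/health")  # still allowed
now[0] = 60.0  # one full window later
after_reset = mcp_limiter.allow("10.0.0.1", "/some/route")
print(results.count(True), results[-1], health_ok, after_reset)
```

The engine's lower budget would correspond to `FixedWindowLimiter(limit=50)` in this sketch; the trade-off discussed above is only the value of `limit`, not the mechanism.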
Q: Where does JWT validation live on the MCP server?
A: Per-route via @require_permission from traceo_mcp_server/auth/middleware.py — there is no global AuthMiddleware on the MCP server. The engine, by contrast, does install a global AuthMiddleware.
Examples
A web client uploading a CSV calls POST :8001/jobs/ingest with a JWT, gets 202 Accepted and a job_id, then polls GET :8001/jobs/{job_id} until status=completed. Concurrently, the user can call GET :8000/api/requirements/... to view the rows that the engine has begun writing.
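The polling half of that flow can be sketched as below. This is a hedged illustration, not the client's actual code: `poll_job` and `fake_fetch` are hypothetical, the `pending`, `running`, and `failed` status values are assumptions (the source only names `completed`), and the HTTP call is replaced by an injected function so the example runs without a server.

```python
import time
from typing import Any, Callable, Dict


def poll_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 1.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status(job_id) until the job reaches a terminal status."""
    for _ in range(max_attempts):
        job = fetch_status(job_id)
        # "failed" is an assumed terminal state; only "completed" is documented.
        if job.get("status") in ("completed", "failed"):
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_attempts} polls")


# Stand-in for GET :8001/jobs/{job_id}; a real client would send the JWT here.
responses = iter(
    [{"status": "pending"}, {"status": "running"}, {"status": "completed"}]
)


def fake_fetch(job_id: str) -> Dict[str, Any]:
    return next(responses)


final = poll_job(fake_fetch, "job-123", sleep=lambda seconds: None)
print(final)
```

Bounding the loop with `max_attempts` keeps a stuck job from polling forever, and each poll also counts against the engine's per-IP rate limit, so the interval should not be too aggressive.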