CRUMB a card from devarno-cloud

Two-Service MCP & Engine Architecture

traceo beginner 4 min read

ELI5

Traceo is a restaurant with two counters. The MCP counter (:8000) takes orders for requirements CRUD and traceability from Claude and the web UI. The Engine counter (:8001) is the kitchen: it ingests CSV/Excel files asynchronously and reports back through /jobs/{id}.

Technical Deep Dive

Service Split

| Service | Path | Port | Entry | Purpose |
| --- | --- | --- | --- | --- |
| MCP Server | traceo_mcp_server/ | 8000 | app.py (FastAPI) + server.py (FastMCP) | Requirements CRUD, traceability, Ariel baselines, audit |
| Engine | engine/src/engine/ | 8001 | server.py | Async ingestion of CSV/Excel into requirements |

Both are Python ASGI apps. The MCP server speaks MCP over stdio (for Claude clients) and serves HTTP (for the Next.js client at web/apps/client). The engine speaks HTTP only.

Component Topology

flowchart LR
    subgraph clients[Clients]
        CLI[Claude MCP via stdio]
        WEB[Next.js client]
    end
    subgraph mcp[MCP Server :8000]
        APP[app.py FastAPI]
        SRV[server.py FastMCP]
        SVC[Services: requirements, traceability, RAG, audit, ariel]
    end
    subgraph eng[Engine :8001]
        ESRV[server.py]
        JM[JobManager]
        RUST[Optional Rust backend]
    end
    PG[(PostgreSQL + pgvector)]
    CLI --> SRV
    WEB --> APP
    APP --> SVC
    SRV --> SVC
    SVC --> PG
    WEB --> ESRV
    ESRV --> JM
    JM --> RUST
    ESRV -->|webhook| APP

Middleware Stacks

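Both apps register middleware bottom-up: the middleware added last wraps everything added before it, so it runs first on each incoming request. The execution order at runtime is therefore the reverse of the registration order: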

| Order | MCP :8000 | Engine :8001 |
| --- | --- | --- |
| outermost | CORSMiddleware | CORSMiddleware |
| middle | RateLimitMiddleware (100/min) | AuthMiddleware (JWT) |
| innermost | (auth applied per-route via decorator) | RateLimitMiddleware (50/min) |
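
The ordering in the table can be reproduced with a toy model. The sketch below is not the Traceo code: `App`, `make_middleware`, and `endpoint` are hypothetical stand-ins that only mimic the "last added is outermost" behaviour described above, using the engine's stack as the example.

```python
from typing import Callable, List

Handler = Callable[[str], List[str]]
Middleware = Callable[[Handler], Handler]


def make_middleware(name: str) -> Middleware:
    """Return a wrapper that records when it runs around the inner handler."""
    def wrap(inner: Handler) -> Handler:
        def handler(path: str) -> List[str]:
            # Record entry, delegate inward, then record exit on the way out.
            return [f"{name}:in"] + inner(path) + [f"{name}:out"]
        return handler
    return wrap


class App:
    """Toy app (hypothetical): each added middleware wraps the ones before it."""

    def __init__(self, endpoint: Handler) -> None:
        self.endpoint = endpoint
        self.registered: List[Middleware] = []

    def add_middleware(self, middleware: Middleware) -> None:
        self.registered.append(middleware)

    def build(self) -> Handler:
        handler = self.endpoint
        # Wrap in registration order, so later middleware encloses earlier ones.
        for middleware in self.registered:
            handler = middleware(handler)
        return handler


def endpoint(path: str) -> List[str]:
    return [f"endpoint:{path}"]


# Engine-style stack, registered innermost first (bottom-up).
engine = App(endpoint)
engine.add_middleware(make_middleware("RateLimitMiddleware"))
engine.add_middleware(make_middleware("AuthMiddleware"))
engine.add_middleware(make_middleware("CORSMiddleware"))

trace = engine.build()("/jobs/ingest")
print(trace)
```

The printed trace starts with CORSMiddleware and reaches RateLimitMiddleware last before the endpoint, matching the outermost-to-innermost column for the engine.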

The engine enforces JWT centrally because every engine endpoint except /health, /ready, /metrics, and /docs requires an authenticated user. The MCP server applies @require_permission(...) at the route/tool level instead, because some tools are read-only and the auth surface differs between HTTP routes and MCP tools.
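
The two auth styles can be contrasted in a few lines. This is a minimal sketch under stated assumptions, not the project's implementation: the real @require_permission signature is not shown in this card, so the decorator below, the `PermissionDenied` error, the user dictionary shape, and the permission names are all hypothetical. Only the four public engine paths come from the text.

```python
from functools import wraps
from typing import Any, Callable, Dict, Optional, Set

# Public engine paths listed in the text; every other path needs a user.
ENGINE_PUBLIC_PATHS: Set[str] = {"/health", "/ready", "/metrics", "/docs"}


class PermissionDenied(Exception):
    """Hypothetical error raised when a caller lacks a permission."""


def engine_requires_user(path: str) -> bool:
    """Central rule: a global middleware checks the JWT on non-public paths."""
    return path not in ENGINE_PUBLIC_PATHS


def require_permission(permission: str) -> Callable:
    """Per-route guard in the spirit of the MCP decorator (signature assumed)."""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(user: Optional[Dict[str, Any]], *args: Any, **kwargs: Any):
            granted = set(user.get("permissions", [])) if user else set()
            if permission not in granted:
                raise PermissionDenied(permission)
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@require_permission("requirements:read")  # hypothetical permission name
def list_requirements(user: Dict[str, Any]) -> list:
    return []


@require_permission("requirements:write")  # hypothetical permission name
def create_requirement(user: Dict[str, Any], title: str) -> Dict[str, str]:
    return {"title": title, "created_by": user["sub"]}
```

With per-route guards, a read-only tool and a write route can demand different permissions, which a single global check cannot express without knowing every route.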

Webhook Direction

The engine notifies the MCP server when a job completes via POST /webhooks/... on the MCP side (routes/webhooks.py). This is the only inter-service call in the system; every other request goes from a client directly to one of the two services.
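
A rough sketch of what the engine side of that callback could look like, using only the standard library. The payload fields (`job_id`, `status`) and the function name are assumptions, and the URL is a placeholder: the actual path under /webhooks/ is not given here, so the caller passes the full URL in.

```python
import json
import urllib.request
from typing import Any, Dict


def build_job_webhook(webhook_url: str, job_id: str, status: str) -> urllib.request.Request:
    """Build (but do not send) an engine-to-MCP job completion callback."""
    payload: Dict[str, Any] = {"job_id": job_id, "status": status}  # assumed schema
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL for illustration only; substitute the real webhook route.
request = build_job_webhook(
    "http://localhost:8000/webhooks/example", "job-123", "completed"
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen with a timeout; building and sending are kept separate here so the request can be inspected without a running MCP server.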

Key Terms

  • MCP Server → FastAPI + FastMCP host on :8000 exposing both HTTP routes and MCP tools.
  • Engine → FastAPI app on :8001 whose only job is async file ingestion via JobManager.
  • Webhook → Engine-to-MCP callback used to signal job completion.

Q&A

Q: Why is the rate limit lower on the engine (50/min) than on the MCP server (100/min)? A: Engine endpoints kick off async jobs that consume CPU and DB writes, so each request is far more expensive than an MCP read. The lower budget caps the queue depth a single IP can create.

Q: What gets bypassed for /health on the MCP server? A: /health is exempt from the rate limiter (otherwise probes would consume the bucket and a load balancer health check could receive a 429), but it is not exempt from CORS.

Q: Where does JWT validation live on the MCP server? A: Per-route, via @require_permission from traceo_mcp_server/auth/middleware.py. There is no global AuthMiddleware on the MCP server; the engine, by contrast, does install one.
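
The rate-limit answers above (a per-IP budget of 100/min or 50/min, with /health exempt on the MCP server) can be illustrated with a small limiter. The card does not say which algorithm RateLimitMiddleware uses, so the sliding window below is an assumption, and the `RateLimiter` class and its parameters are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, FrozenSet


class RateLimiter:
    """Per-IP sliding-window limiter; the real middleware may work differently."""

    def __init__(
        self,
        limit: int,
        window_s: float = 60.0,
        exempt: FrozenSet[str] = frozenset(),
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_s = window_s
        self.exempt = exempt
        self.clock = clock
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, path: str) -> bool:
        """Return True to let the request through, False to answer 429."""
        if path in self.exempt:
            return True  # exempt paths never consume budget
        now = self.clock()
        window = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self.window_s:
            window.popleft()
        if len(window) >= self.limit:
            return False
        window.append(now)
        return True


# Budgets from the text; engine exemptions are not stated, so none are set.
mcp_limiter = RateLimiter(limit=100, exempt=frozenset({"/health"}))
engine_limiter = RateLimiter(limit=50)
```

Because the counters are keyed by IP, one noisy client exhausts only its own budget, and exempt probes never count toward it.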

Examples

A web client uploading a CSV calls POST :8001/jobs/ingest with a JWT, gets 202 Accepted and a job_id, then polls GET :8001/jobs/{job_id} until status=completed. Concurrently, the user can call GET :8000/api/requirements/... to view the rows that the engine has begun writing.
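
The polling half of that flow might look like the sketch below. It is deliberately transport-agnostic: `fetch_status` stands in for GET :8001/jobs/{job_id} and is supplied by the caller, so the loop can be exercised without a running engine. The function name, the "failed" status, and the poll budget are assumptions; only status=completed comes from the text.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status or the budget runs out."""
    for _ in range(max_polls):
        job = fetch_status(job_id)  # stands in for GET :8001/jobs/{job_id}
        # "completed" is from the text; "failed" is an assumed terminal state.
        if job.get("status") in ("completed", "failed"):
            return job
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

A real client would pass a function that performs the authenticated HTTP GET and returns the decoded JSON body. Capping the number of polls keeps a stuck job from blocking the caller indefinitely.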
