Kubernetes Deployment Topology
ELI5
Perch is a small city deployed by one Helm chart: pillars that need disk (Prometheus, Loki, AlertManager) live in StatefulSets, the rest in Deployments, traffic between buildings is gated by NetworkPolicies, and ServiceMonitors are the postal-discovery system that finds new houses (Engines) automatically.
Technical Deep Dive
A single Helm chart at perch/helm deploys everything into the perch-system namespace. Its templates render the following workloads:
flowchart TB
  subgraph perch-system["namespace: perch-system"]
    direction TB
    subgraph Stateful["StatefulSets"]
      P[prometheus<br/>2 replicas]
      AM[alertmanager<br/>3 replicas HA]
      LO[loki]
    end
    subgraph Stateless["Deployments"]
      G[grafana]
      J[jaeger]
      BR[bridge]
      CM[cost-monitor]
      ST[slo-tracker]
      PE[policy-enforcer]
      TC[trace-correlator]
      TQ[thanos-query]
    end
    subgraph DaemonSet
      PT[promtail]
    end
    SM[ServiceMonitor CRDs]
  end
  SM -.discover.-> P
  PT -->|tail logs| LO
  P --> TQ
  P --> G
  LO --> G
  J --> G
  BR --> P
  CM --> P
  ST --> P
  PE --> ST
  TC --> J
  TC --> LO

Workload Reference
| Workload | Kind | Replicas | Notes |
|---|---|---|---|
| prometheus | StatefulSet | 2 | scrape interval 15 s, retention 30 d, PVCs per replica |
| alertmanager | StatefulSet | 3 | gossip cluster for HA |
| loki | StatefulSet | configurable | log store backend for Promtail |
| promtail | DaemonSet | per node | tails container logs into Loki |
| grafana | Deployment | 2 | SSO + RBAC for dashboards |
| jaeger | Deployment | 2 | trace ingest + query |
| thanos-query | Deployment | 2 | aggregates Prometheus replicas, S3 backend for long-term |
| bridge | Deployment | 1 | federates nestr_* to Folio (nestr-009) |
| cost-monitor, slo-tracker, policy-enforcer, trace-correlator | Deployment | 1 each | small Go services (nestr-010) |
Discovery
ServiceMonitors live under perch/k8s/, one per scrape target (api, shield, relay, aria, manuscript, printery). Each selects an Engine's Service by label, and the Prometheus operator turns that selection into scrape configs for the backing pods, so new targets are picked up without redeploying Prometheus.
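The matching rule behind that discovery is simple: the matchLabels form of a Kubernetes label selector matches an object when every listed key/value pair appears in that object's labels. A minimal Python sketch of the rule follows. The label keys and values are hypothetical, chosen only for illustration; the real ones are whatever the manifests under perch/k8s/ declare.

```python
from typing import Mapping


def selector_matches(match_labels: Mapping[str, str], labels: Mapping[str, str]) -> bool:
    """Return True when every key/value pair in match_labels is present in labels.

    This mirrors matchLabels semantics: extra labels on the object are ignored,
    and an empty selector matches everything.
    """
    return all(labels.get(key) == value for key, value in match_labels.items())


def select_targets(match_labels: Mapping[str, str],
                   objects: Mapping[str, Mapping[str, str]]) -> list[str]:
    """Return the names of all objects whose labels satisfy the selector."""
    return [name for name, labels in objects.items()
            if selector_matches(match_labels, labels)]


# Hypothetical label sets; not taken from the Perch manifests.
objects = {
    "api": {"example.com/role": "engine", "example.com/engine": "api"},
    "relay": {"example.com/role": "engine", "example.com/engine": "relay"},
    "grafana": {"example.com/role": "dashboard"},
}

api_selector = {"example.com/engine": "api"}
engine_selector = {"example.com/role": "engine"}

api_targets = select_targets(api_selector, objects)
engine_targets = select_targets(engine_selector, objects)
```

Because matching depends only on labels, a new instance that carries the same labels is selected without any change to the selector itself.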
RBAC & NetworkPolicies
perch/k8s/rbac/ defines a ClusterRole, a ClusterRoleBinding, and a ServiceAccount so that Prometheus can list and watch pods cluster-wide. perch/deployment/production/network-policies.yaml restricts ingress and egress: only Grafana is exposed externally, and the custom services accept traffic only from Prometheus and from each other, on documented ports.
Long-term Storage
Thanos sidecar pattern (assumed from the thanos-query template; the sidecar template is not enumerated in this revision): a sidecar next to each Prometheus replica uploads blocks to S3-compatible storage, and thanos-query fans queries out across the live replicas and the historical S3 blocks. Retention is effectively unbounded at the object-store layer, while hot retention in Prometheus stays at 30 d.
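To make the hot/cold split concrete, here is a small Python sketch that reports which storage tiers a query's time range has to touch, given the 30 d hot retention. It is a deliberate simplification and not how Thanos itself routes queries: in practice thanos-query asks every store that advertises an overlapping time range, and recently uploaded blocks exist in the object store as well. The function name and tier labels are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hot retention in Prometheus, as stated in the workload table.
HOT_RETENTION = timedelta(days=30)


def tiers_required(start: datetime, end: datetime, now: datetime,
                   hot_retention: timedelta = HOT_RETENTION) -> list[str]:
    """Return the tiers needed to cover the range [start, end].

    Simplifying assumption: samples newer than the hot cutoff are served by
    Prometheus, and samples older than it exist only in the object store.
    """
    if start > end:
        raise ValueError("start must not be after end")
    hot_cutoff = now - hot_retention
    tiers = []
    if end >= hot_cutoff:      # range reaches into the hot window
        tiers.append("prometheus")
    if start < hot_cutoff:     # range reaches back past the hot window
        tiers.append("object-store")
    return tiers


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
last_week = tiers_required(now - timedelta(days=7), now, now)
last_quarter = tiers_required(now - timedelta(days=90), now, now)
```

Under these assumptions a one-week dashboard is answered from the Prometheus PVCs alone, while a 90-day view needs both tiers.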
Key Terms
- ServiceMonitor → Prometheus-Operator CRD that turns a label selector into a scrape config.
- HA cluster → AlertManager peers gossip alert state to deduplicate paging across replicas.
- Sidecar → Thanos pattern co-locating an uploader next to each Prometheus replica for S3 offload.
Q&A
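Q: Why three AlertManager replicas and not two?
A: AlertManager replicas coordinate by gossip rather than consensus, so there is no quorum to lose: any surviving replica keeps sending notifications. Three replicas mean that losing one (a node failure or a rolling upgrade) still leaves a redundant pair sharing notification state and silences, so pages stay deduplicated and delivery never hinges on a single pod.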
Q: How does adding a new Engine instance get scraped?
A: It carries the labels matched by an existing ServiceMonitor; Prometheus-Operator regenerates the scrape config on the next reconcile. No chart change required.
Q: Where does long-term metric data live?
A: In the S3 backend behind Thanos. Prometheus PVCs are sized for 30 d hot data; older queries are fanned out to S3 via thanos-query.
Examples
A green-field install, run from the perch/ directory (the chart lives at perch/helm): kubectl create ns perch-system && helm install perch ./helm -n perch-system. Wait for prometheus-0, prometheus-1, all three alertmanager-* pods, and the grafana-* pods to reach Ready, then run kubectl port-forward svc/perch-grafana 3000:80 -n perch-system and open http://localhost:3000 to land on the default dashboards, which are already wired against the nestr_* and orchestrator_* series.
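StatefulSet pods are named after their StatefulSet plus a zero-based ordinal, which is why the wait list contains prometheus-0 and prometheus-1. The sketch below derives the pod names to wait on from the replica counts in the workload table. It assumes the StatefulSets are named exactly prometheus and alertmanager; if the chart adds a release prefix, the names change accordingly.

```python
def statefulset_pod_names(name: str, replicas: int) -> list[str]:
    """Return the pod names a StatefulSet creates: <name>-0 .. <name>-(replicas-1)."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


# Replica counts from the workload table; StatefulSet names are assumed.
STATEFULSET_REPLICAS = {"prometheus": 2, "alertmanager": 3}

expected_pods = [
    pod
    for name, replicas in STATEFULSET_REPLICAS.items()
    for pod in statefulset_pod_names(name, replicas)
]
```

Each resulting name can be passed to kubectl wait --for=condition=Ready pod/<name> -n perch-system. Deployment pods such as grafana-* carry generated suffixes, so they are easier to wait on by label or with kubectl rollout status.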
Neighbors on the Map
- FNP Kubernetes Multi-Region Architecture: deploying FNP across multiple regions
- Deployment Topology & Proxy Conflict Resolution: setting up a new environment (kitten/cat/lion)
- LORE+CAIRNET Deployment Topology & Service Map: understanding the LORE deployment architecture
- OSS & Cloud Modes: self-hosting via docker compose