Score Metric Model

sparki intermediate 5 min read

ELI5

The score subsystem is a scoreboard. Each scribble is a Metric (one number with a unit and a status); a column of those over time is a MetricSeries with min/max/median/stddev pre-computed. Roll-ups for builds and tests live in their own structs so the dashboard does not have to recompute them.

Technical Deep Dive

subsystems/score/types.go declares the metric primitives. The aggregator computes series stats once on insert; queries return them flat.

Type Hierarchy

classDiagram
  class MetricType {
    build
    test
    quality
    deployment
    performance
    custom
  }
  class MetricStatus {
    success
    warning
    failure
    unknown
  }
  class Metric {
    +string ID
    +string Name
    +MetricType Type
    +float64 Value
    +string Unit
    +MetricStatus Status
    +time.Time Timestamp
    +string ProjectPath
    +map Tags
    +map Metadata
  }
  class MetricSeries {
    +string Name
    +MetricType Type
    +Metric[] Metrics
    +time.Time StartTime
    +time.Time EndTime
    +int Count
    +float64 Average
    +float64 Min
    +float64 Max
    +float64 Median
    +float64 StdDev
  }
  class BuildMetrics {
    +int TotalBuilds
    +int SuccessfulBuilds
    +int FailedBuilds
    +float64 SuccessRate
    +Duration AverageDuration
    +Duration MedianDuration
    +Duration FastestBuild
    +Duration SlowestBuild
    +time.Time LastBuildTime
    +MetricStatus LastBuildStatus
    +map FailureReasons
    +TrendDirection TrendDirection
  }
  class TestMetrics {
    +int TotalTests
    +int PassedTests
    +int FailedTests
    +int SkippedTests
    +string[] FlakyTests
  }
  Metric --> MetricType
  Metric --> MetricStatus
  MetricSeries --> Metric : contains
  BuildMetrics --> MetricStatus

Why Pre-Computed Series Aggregates

MetricSeries carries Average, Min, Max, Median, StdDev, Count. These are computed once at insert/aggregation time so dashboard reads are O(1). The trade-off is that out-of-band inserts (back-dated metrics) must trigger a recomputation pass; the file store (file_store.go) handles this.

TrendDirection

Carried on BuildMetrics. Values are categorical (improving / stable / regressing) and are computed from a rolling window over MetricSeries slope.

Flaky Tests

TestMetrics.FlakyTests []string is a list of test identifiers that have observed both pass and fail outcomes within the analysis window. Detection lives in the test collector (test_collector.go); the model only stores the resolved list.

Key Terms

Metric → one timestamped measurement with a categorical status
MetricSeries → a window of metrics plus pre-computed stats
BuildMetrics → roll-up over a window of build outcomes
TrendDirection → categorical slope label on BuildMetrics
FlakyTests → identifiers that flip pass/fail within the window

Q&A

Q: Why have both Status on Metric and MetricStatus on BuildMetrics? A: They mean different things. Metric.Status is per-sample (this single measurement passed/failed thresholds). BuildMetrics.LastBuildStatus is the most recent build’s outcome — a roll-up convenience field.

Q: Are Tags indexed? A: At the model level they are map[string]string, opaque. Whether the storage layer indexes specific tag keys depends on the file/DB store; file_store.go keeps them as JSON.

Q: Is custom MetricType free-form? A: Yes — it exists so plugins can publish metrics without forcing a schema change. The dashboard renders custom metrics generically (name, value, unit) without special charts.

Examples

A build pipeline ends and the score collector emits: Metric{Type:build, Name:"build.duration_ms", Value:182000, Unit:"ms", Status:success} plus Metric{Type:test, Name:"test.pass_rate", Value:0.97, Unit:"ratio", Status:warning}. The aggregator updates today’s MetricSeries, recomputes mean/stddev, and bumps BuildMetrics.LastBuildTime. The dashboard renders the trend arrow from BuildMetrics.TrendDirection.

neighbors on the map

FNP Observability & Prometheus Metrics monitoring FNP systems
OpenTelemetry Instrumentation & Metrics adding observability to iris-service code
Run Outcome Classification interpreting a History row's status pill