CRUMB a card from devarno-cloud

Retention Soft & Hard Delete

kahn intermediate 4 min read

ELI5

The recycle bin pattern. When a run’s clock runs out, it goes to the bin (still recoverable for a week). After the week, the truck comes and it’s gone for good. Anything in the bin can still be exported on its way out.

Technical Deep Dive

backend/kahn/retention.py is the pure classifier that closes invariant I-11 operationally. The Phase 1+2 beta rule:

retention_expires_at = started_at + BETA_RETENTION_DAYS (90 days)
soft-delete: within 24h after expiry
hard-delete: 7 days after soft-delete
export endpoint: returns soft-deleted runs (a customer on their last
                 day of retention can export runs that expired 3 days
                 ago but aren't yet hard-deleted)

Lifecycle

stateDiagram-v2
[*] --> active
active --> soft_deleted: now >= started_at + 90d (within 24h)
soft_deleted --> hard_deleted: now >= soft_deleted_at + 7d
soft_deleted --> exported: customer hits export endpoint
exported --> hard_deleted: same 7d clock applies
hard_deleted --> [*]
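
The edges of the diagram can be written down as a small transition table, which makes an illegal move (for example active straight to hard_deleted) easy to reject. This is only a sketch: `ALLOWED` and `can_transition` are hypothetical names and are not claimed to exist in retention.py.

```python
# Hypothetical transition table mirroring the lifecycle diagram.
ALLOWED = {
    "active": {"soft_deleted"},
    "soft_deleted": {"hard_deleted", "exported"},
    "exported": {"hard_deleted"},
    "hard_deleted": set(),  # terminal state
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram has an edge current -> target."""
    return target in ALLOWED.get(current, set())
```

Keeping the table as data means a new state, if one is ever added, changes one dictionary rather than scattered conditionals.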

Module Shape

| Symbol | Role |
| --- | --- |
| BETA_RETENTION_DAYS | Pulled from billing.py; 90 in beta |
| SOFT_DELETE_GRACE | 24h window after expiry |
| HARD_DELETE_GRACE | 7d after soft-delete |
| compute_expiry(started_at) | Returns started_at + BETA_RETENTION_DAYS |
| classify_run(...) | Returns RunRetentionDecision (active, soft_delete, hard_delete, noop) |

The module is pure — no DB access. Decisions are row-level; the orchestration sits in deployment/cloud-api/ scripts (not in this module), which call the classifier per row and then issue the SQL.
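
The split between a pure classifier and a side-effecting job might look roughly like the sketch below. The symbol names come from the table above, but everything else is assumed: the `RunRow` shape, the `classify_run(row, now)` signature, the `plan_writes` stand-in for the job, and the reading of `noop` as "already soft-deleted, grace not yet elapsed". The 24h soft-delete window is not modelled here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Iterable, List, Optional, Tuple

BETA_RETENTION_DAYS = 90               # value stated in the card
HARD_DELETE_GRACE = timedelta(days=7)  # value stated in the card


class RunRetentionDecision(Enum):
    ACTIVE = "active"
    SOFT_DELETE = "soft_delete"
    HARD_DELETE = "hard_delete"
    NOOP = "noop"


@dataclass(frozen=True)
class RunRow:
    """Hypothetical row shape; the real columns may differ."""
    run_id: str
    started_at: datetime
    soft_deleted_at: Optional[datetime] = None


def compute_expiry(started_at: datetime) -> datetime:
    return started_at + timedelta(days=BETA_RETENTION_DAYS)


def classify_run(row: RunRow, now: datetime) -> RunRetentionDecision:
    """Pure function: the result depends only on the row and the clock passed in."""
    if row.soft_deleted_at is not None:
        if now >= row.soft_deleted_at + HARD_DELETE_GRACE:
            return RunRetentionDecision.HARD_DELETE
        # Assumption: noop means "already in the bin, nothing to do yet".
        return RunRetentionDecision.NOOP
    if now >= compute_expiry(row.started_at):
        return RunRetentionDecision.SOFT_DELETE
    return RunRetentionDecision.ACTIVE


def plan_writes(rows: Iterable[RunRow], now: datetime) -> List[Tuple[str, RunRetentionDecision]]:
    """Stand-in for the external job: keep only the rows that need a write."""
    needs_write = {RunRetentionDecision.SOFT_DELETE, RunRetentionDecision.HARD_DELETE}
    decisions = ((row.run_id, classify_run(row, now)) for row in rows)
    return [(run_id, d) for run_id, d in decisions if d in needs_write]
```

Because `now` is a parameter rather than a call to the system clock, the same row can be classified at any point in time without touching a database, and a real job would only need to turn the returned pairs into SQL.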

Key Terms

  • soft delete → Row marked deleted but still readable via the export endpoint.
  • hard delete → Row irrecoverably removed; storage reclaimed.
  • I-11 → “Retention has bounded grace and is operator-explainable per row” — the invariant this module satisfies.

Q&A

Q: Why split into pure classifier + external job? A: The classifier is unit-testable without a DB; the job carries the side effects. Same split as diagnostics.py — pure functions over rows.

Q: What if a customer hits export 3 days after soft-delete? A: The export endpoint serves the run because it’s still within the 7-day hard-delete grace. By day 8, the row is gone.
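
The export rule in this answer reduces to one comparison against the hard-delete grace. A minimal sketch, assuming a hypothetical `is_exportable` helper and a `soft_deleted_at` timestamp on the row (neither is confirmed by the card):

```python
from datetime import datetime, timedelta
from typing import Optional

HARD_DELETE_GRACE = timedelta(days=7)  # value stated in the card


def is_exportable(soft_deleted_at: Optional[datetime], now: datetime) -> bool:
    """Hypothetical check: exportable while active or still inside the grace."""
    if soft_deleted_at is None:
        return True  # active run, never soft-deleted
    return now < soft_deleted_at + HARD_DELETE_GRACE
```

This captures only the rule-level boundary. The moment a row physically disappears depends on when the hard-delete job next runs.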

Q: Can BETA_RETENTION_DAYS differ per tenant? A: Not in beta — the constant is global. Future plans (billing.py::TENANT_PLAN_*) introduce per-plan retention; the classifier API already takes the tenant’s days as an argument so the module doesn’t need to change.
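
One way the "days as an argument" shape could look is sketched below. The `retention_days` keyword and the `PLAN_RETENTION_DAYS` mapping are assumptions for illustration; the card names only the billing.py::TENANT_PLAN_* prefix, and the "extended" plan with its 365-day value is invented here.

```python
from datetime import datetime, timedelta

BETA_RETENTION_DAYS = 90  # value stated in the card

# Hypothetical plan table with illustrative values only.
PLAN_RETENTION_DAYS = {"beta": BETA_RETENTION_DAYS, "extended": 365}


def compute_expiry(started_at: datetime, retention_days: int = BETA_RETENTION_DAYS) -> datetime:
    """Expiry for a run, with the retention length supplied by the caller."""
    return started_at + timedelta(days=retention_days)


started = datetime(2026, 2, 4)
beta_expiry = compute_expiry(started)
extended_expiry = compute_expiry(started, PLAN_RETENTION_DAYS["extended"])
```

With the default in place, beta callers keep working unchanged while a per-plan caller passes its own number.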

Examples

Run started 2026-02-04; today is 2026-05-05, exactly 90 days later. compute_expiry → 2026-05-05, so classify_run returns soft_delete and the nightly job sets deleted_at = now. On 2026-05-12, 7 days after the soft delete, a scheduled weekly job calls the classifier again, gets hard_delete, and removes the row.
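
The dates in this example can be checked with plain date arithmetic. The snippet uses only the standard library and the two durations stated in the card. The variable names are illustrative.

```python
from datetime import date, timedelta

started = date(2026, 2, 4)
today = date(2026, 5, 5)

expiry = started + timedelta(days=90)        # BETA_RETENTION_DAYS
hard_delete_on = expiry + timedelta(days=7)  # soft-deleted on the expiry day

print((today - started).days)  # 90
print(expiry)                  # 2026-05-05
print(hard_delete_on)          # 2026-05-12
```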
