CRUMB a card from devarno-cloud

Protobuf BaseEvent Envelope

skyflow intermediate 5 min read

ELI5

Every Skyflow event is a protobuf message that begins with the same nine-field “envelope” called BaseEvent, like the address sticker on a parcel. The envelope tells you who the event is about (aggregate_id), why it was sent (causation_id), and how to follow the trail across services (correlation_id). The actual payload sits next to the envelope as additional fields.

Technical Deep Dive

BaseEvent Definition

From proto/skyflow/events/v1/user.proto:

#FieldTypePurpose
1event_idstring (UUID)Idempotency key — consumers dedupe on this
2event_typestringFully-qualified type, e.g. "skyflow.events.v1.XPEarnedEvent"
3timestampgoogle.protobuf.TimestampWall-clock at producer
4aggregate_idstringEntity ID this event mutates (user_id, subscription_id, …)
5aggregate_typestringEntity type (User, Subscription, …)
6versionint32Event schema version
7correlation_idstringRequest trace ID — same across all events from one user request
8causation_idstringevent_id of the parent event that caused this one
9metadatagoogle.protobuf.StructFree-form additional context

Composition Pattern

Concrete events embed BaseEvent as field #1 and add domain fields. Example:

message XPEarnedEvent {
BaseEvent base = 1;
string user_id = 2;
int32 xp_delta = 3;
int32 xp_total = 4;
int32 xp_multiplier = 5;
string reason = 6;
google.protobuf.Struct context = 7;
bool bonus = 8;
string bonus_reason = 9;
}

Note: user_id is duplicated outside BaseEvent.aggregate_id for ergonomics — both should agree.

Class View

classDiagram
class BaseEvent {
+string event_id
+string event_type
+Timestamp timestamp
+string aggregate_id
+string aggregate_type
+int32 version
+string correlation_id
+string causation_id
+Struct metadata
}
class XPEarnedEvent { +BaseEvent base; +string user_id; +int32 xp_delta; ... }
class TimelineAnalyzedEvent { +BaseEvent base; +string analysis_id; ... }
class SubscriptionCreatedEvent { +BaseEvent base; +Tier tier; ... }
class UserRegisteredEvent { +BaseEvent base; +string did; ... }
BaseEvent <|-- XPEarnedEvent
BaseEvent <|-- TimelineAnalyzedEvent
BaseEvent <|-- SubscriptionCreatedEvent
BaseEvent <|-- UserRegisteredEvent

Wire Layout

packet
0-15: "BaseEvent (tag 1, length-delimited)"
16-31: " event_id (UUID string)"
32-47: " event_type, timestamp"
48-63: " aggregate_id, aggregate_type"
64-79: " version (int32), correlation_id"
80-95: " causation_id, metadata (Struct)"
96-127: "Domain fields (tags 2..N)"

The whole message is proto.Marshal’d to bytes and shipped as the NATS message body. JetStream’s duplicate_window (e.g. 30s on GAMIFICATION) uses the NATS Nats-Msg-Id header — best practice is to set that header to BaseEvent.event_id so dedup works across producer retries.

Key Terms

  • Aggregate → DDD term: the entity whose state the event mutates. aggregate_id = its primary key.
  • Correlation ID → identifier shared by every event in a single user-initiated workflow (set once at the API edge)
  • Causation ID → the parent event’s event_id; lets you reconstruct the causal DAG (A → B → C)
  • Schema version → bumped when a non-additive change is made; consumers branch on it

Q&A

Q: How do consumers detect duplicate deliveries? A: They store the last N processed event_ids (or rely on JetStream duplicate_window plus the Nats-Msg-Id header set to event_id).

Q: When should version be bumped? A: When a field is removed, renumbered, or its semantics change. Adding an optional field at a new tag number is backwards-compatible per protobuf rules and does not require a bump.

Q: Why store both correlation_id and causation_id? A: correlation_id groups the entire workflow (“the user’s POST /timeline/analyze call”); causation_id reconstructs the parent-child chain so you can render a tree, not a flat list.

Q: What encoding crosses the wire? A: Binary protobuf (proto.Marshal). NATS treats the body as opaque bytes; consumers proto.Unmarshal with the matching generated type.

Examples

Imagine each event is a postcard. BaseEvent is the address block on the back: stamp number (event_id), addressee (aggregate_id), the holiday-trip code stamped on every postcard from this trip (correlation_id), and a reference to the postcard that prompted you to write this one (causation_id). The picture side is the domain payload.

neighbors on the map