Automerge IPLD CRDT Integration
weave advanced 7 min read
ELI5
AutomergeState is a recipe card box where each recipe card (Operation) has a unique barcode (MsgID). Merging two boxes just means combining them without duplicates. The box has a running checksum (state_hash) that changes whenever a card is added, so two boxes that merged everything will have the same checksum.
Technical Deep Dive
Class Diagram
classDiagramclass Operation { +MsgID id +u8 op_type +Vec~u8~ payload +Option~MsgID~ parent +u64 lamport +new(id, op_type, payload, parent, lamport) Operation +to_bytes() Vec~u8~}
class AutomergeState { +BTreeMap~MsgID Operation~ operations +[u8; 32] state_hash +u32 edit_count +new() AutomergeState +add_operation(op) Result +merge(other) Result +get_operation(id) Option~Operation~ +get_state() (u32, [u8;32]) +contains(id) bool +operation_count() usize}
AutomergeState --> OperationOperation --> MsgIDSource: mesh-node/src/crdt.rs.
Operation Serialisation (to_bytes layout)
[0..31] id bytes (32 bytes, MsgID)[32] op_type (1 byte)[33..] payload (variable)[+1] has_parent flag (0 or 1)[+32?] parent bytes (32 bytes, present only if has_parent = 1)[last 8] lamport (u64, little-endian)Source: crdt.rs lines 37–51.
Merge & State Hash Sequence
sequenceDiagramparticipant NodeA as AutomergeState Aparticipant NodeB as AutomergeState B
NodeA->>NodeA: add_operation(op1)NodeA->>NodeA: recompute_state_hash()NodeB->>NodeB: add_operation(op2)NodeB->>NodeB: recompute_state_hash()NodeA->>NodeA: merge(&NodeB)Note right of NodeA: For each op in B.operations:\n if not in A → add_operation(op)NodeA->>NodeA: recompute_state_hash() [called by add_operation]NodeA-->>NodeB: (state_hash now matches after B.merge(&A))state_hash Algorithm
recompute_state_hash() is NOT a standard cryptographic hash. It is an additive accumulator over BTreeMap iteration order:
for (i, (_id, op)) in self.operations.iter().enumerate() { let op_bytes = op.to_bytes(); for (j, &byte) in op_bytes.iter().enumerate() { hasher[(i + j) % 32] = hasher[(i + j) % 32].wrapping_add(byte); }}Implications:
- Not collision-resistant: different operation sets could produce the same hash.
- Deterministic:
BTreeMapiteration is sorted byMsgID(Ord-derived), so the same operation set always produces the same hash. - Intent: integrity fingerprint for quick equality checks, not a security primitive.
CRDT Merge Semantics
merge(&other) is a union merge: for each (id, op) in other.operations, if id is not in self.operations, call add_operation(op). Already-present operations are silently skipped (idempotent). This satisfies CRDT commutativity and idempotence properties.
Key Terms
- Operation → A single Automerge change:
id(MsgID),op_type(u8),payload, optionalparent,lamport - AutomergeState → CRDT document state: union of all received operations indexed by MsgID
- state_hash → 32-byte rolling accumulator over all operations in BTreeMap order; not cryptographically secure
- edit_count → Monotonically increasing counter of total operations ever added (including from merges)
- merge → Union merge: adds operations from
otherthat are absent inself; idempotent and commutative
Q&A
Q: Is state_hash suitable for use as a Byzantine integrity check?
A: No. The additive accumulator (wrapping_add) is not collision-resistant. Two different operation sets could produce the same hash. For Byzantine integrity, callers should use the MsgID set directly (compare operations.keys()) or apply BLAKE3 over the serialised operation set.
Q: What does op_type represent and where is it decoded?
A: op_type is a u8 that is serialised into to_bytes() but has no enum definition in crdt.rs. It is passed opaquely from the application layer — the CRDT integration layer treats it as an identifier tag for the caller to interpret.
Q: Does merge() guarantee both peers reach the same state_hash?
A: Yes, once both have merged all of each other’s operations. Since BTreeMap iterates in MsgID order (deterministic), and the accumulator formula is deterministic, any two AutomergeStates containing the same set of operations will compute identical state_hash values.
Examples
Round-trip merge test pattern (from crdt.rs lines 187–196):
state1.add_operation(create_test_op(4, 0, None)).unwrap(); // op id [4;32]state2.add_operation(create_test_op(5, 0, None)).unwrap(); // op id [5;32]state1.merge(&state2).unwrap();assert_eq!(state1.operation_count(), 2);After state2.merge(&state1), both states have operations {[4;32], [5;32]} and identical state_hash. The hash equality check in the determinism test (crdt.rs lines 177–183) confirms this property.
neighbors on the map
- LCB DAG State Machine debugging why a message is stuck in the DAG and never delivered to the application
- LCBMessage & DAGNode Schema adding a new field to LCBMessage and need to update all serialisation sites
- FNP CRDT Conflict-Free Merge Semantics understanding CRDT properties and guarantees
- CRDT Operation Message adding a new operation type
- Slash Commands as CRDT Operations adding a new slash command to the editor palette