This EIP introduces a new Simple Serialize (SSZ) type to represent containers with forward-compatible Merkleization: A given field is always assigned the same stable generalized index (gindex) even when different container versions append new fields or drop existing fields.
Motivation
SSZ containers are frequently versioned, for example across fork boundaries. When the number of fields reaches a new power of two, or a field is removed or replaced with one of a different type, the shape of the underlying Merkle tree changes, breaking verifiers of Merkle proofs for these containers. Deploying a new verifier may involve security councils to upgrade smart contract logic, or require firmware updates for embedded devices. This effort is needed even when no semantic changes apply to the fields that the verifier is interested in.
Further, if multiple versions of an SSZ container coexist at the same time, for example to represent transaction profiles, the same field may be assigned to a different gindex in each version. This unnecessarily complicates verifiers and introduces a maintenance burden, as the verifier has to be kept up to date with version specific field to gindex map.
Progressive containers address these shortcomings by:
Using the progressive Merkle tree structure to progressively grow to the actual field count with minimal overhead, ensuring provers remain valid as the field count chanages.
Assigning stable gindices for each field across all versions by allowing gaps in the Merkle tree where a field is absent.
Serializing in a compact form where absent fields do not consume space.
Specification
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 and RFC 8174.
progressive container: ordered heterogeneous collection of values with custom Merkleization
python dataclass notation with key-type pairs, e.g.
classSquare(ProgressiveContainer[active_fields=[1,0,1]]):side:uint16# Merkleized at field index #0 (location of first 1 in `active_fields`)
color:uint8# Merkleized at field index #2 (location of second 1 in `active_fields`)
classCircle(ProgressiveContainer[active_fields=[0,1,1]]):radius:uint16# Merkleized at field index #1 (location of first 1 in `active_fields`)
color:uint8# Merkleized at field index #2 (location of second 1 in `active_fields`)
compatible union: union type containing one of the given subtypes with compatible Merkleization
notation CompatibleUnion[type_0, type_1, ...], e.g. CompatibleUnion[Square, Circle]
Compatible unions are always considered “variable-size”, even when all type options share the same fixed length.
ProgressiveContainer with an active_fields list ending in 0 are illegal.
ProgressiveContainer with an active_fields list of more than 256 entries are illegal.
ProgressiveContainer with an active_fields list with a different count of 1 than fields are illegal.
CompatibleUnion[type_0, type_1, ...] without any type options are illegal.
CompatibleUnion[type_0, type_1, ...] without a ProgressiveContainer[active_fields=[]] type option and more than 127 type options are illegal.
CompatibleUnion[type_0, type_1, ...] with a ProgressiveContainer[active_fields=[]] type option and more than 128 type options are illegal.
CompatibleUnion[type_0, type_1, ...] with a ProgressiveContainer[active_fields=[]] type option at a position other than type_0 are illegal.
CompatibleUnion[type_0, type_1, ...] with a type option that has incompatible Merkleization with another type option are illegal.
Compatible Merkleization
Types are compatible with themselves.
byte is compatible with uint8 and vice versa.
Bitlist[N] / Bitvector[N] are compatible if they share the same capacity N.
List[type, N] / Vector[type, N] are compatible if type is compatible and they share the same capacity N.
ProgressiveList[type] are compatible if type is compatible.
Container are compatible if they share the same field names in the same order, and all field types are compatible.
ProgressiveContainer[active_fields] are compatible if all 1 entries in both type’s active_fields correspond to fields with shared names and compatible types, and no other field name is shared across both types.
CompatibleUnion[type_0, type_1, ...] and any of the type options type_0, type_1, … are compatible and vice versa.
CompatibleUnion are compatible with each other if all type options across both CompatibleUnion are compatible.
All other types are incompatible.
Serialization
Serialization of ProgressiveContainer[active_fields] are identical to Container.
A value as CompatibleUnion[T...] type has properties value.data with the contained value, and value.selector which indexes the selected CompatibleUnion type option T. The index is based at 0 if a ProgressiveContainer[active_fields=[]] type option is present. Otherwise, it is based at 1.
Deserialization of ProgressiveContainer[active_fields] is identical to Container.
For CompatibleUnion, the deserialization logic is updated:
In the case of compatible unions, the first byte of the deserialization scope is deserialized as type selector, the remainder of the scope is deserialized as the selected type.
The following invalid input needs to be hardened against:
An out-of-bounds type selector in a CompatibleUnion
get_active_fields: Given a value of type ProgressiveContainer[active_fields] return value.__class__.active_fields.
mix_in_active_fields: Given a Merkle root root and an active_fields bitlist return hash(root, pack_bits(active_fields)). Note that active_fields is restricted to ≤ 256 bits.
The Merkleization definitions are extended.
mix_in_active_fields(merkleize_progressive([hash_tree_root(element) for element in value]), get_active_fields(value)) if value is a progressive container.
hash_tree_root(value.data) if value is of compatible union type. Note that the selector is only used for serialization, it is not mixed in when Merkleizing.
Rationale
Why active_fields?
active_fields conceptually creates a single Merkle tree shape across all prior, current, and future versions, where each field (as identified by its name) is assigned a stable gindex and can either be present or absent. This allows appending new fields or dropping existing fields as the data type evolves without impacting hash_tree_root, reducing the maintenance burden for provers and verifiers of partial data.
Why CompatibleUnion?
SSZ does not allow skipping of unknown fields when deserializing data, requiring the full schema to be known and preventing a forward compatible serialization format. Each specification version only allows select combinations of active_fields at any time. Using CompatibleUnion reduces serialization overhead (especially when nesting is involved), and decreases the number of deserialization error conditions.
Note that hash_tree_root is unaffected by any given serialization. For Merkleization and proof verification, active_fields models the underlying data as a Container full of optional fields. When only partial data is desired (e.g., a staking pool that wishes to verify if a validator got slashed), a forward compatible proof format can be designed on a per application basis.
Further note that CompatibleUnion is solely a serialization optimization. The CompatibleUnion selector is not mixed into hash_tree_root. Instead, the underlying active_fields are mixed in.
Why not Optional[type]?
Introducing Optional is not required by any current functionality, and would trigger serialization overhead by requiring “variable-size” encoding for all enclosing types.
If Optional support becomes desirable, and it is not feasible to model the valid field combinations as a CompatibleUnion, the active_fields concept should be reused by updating get_active_fields to only emit 1 for fields that are always required as well as for optional fields that are present, and emit 0 for fields that are always excluded as well as for optional fields that are absent, retaining compatibility across the proposed ProgressiveContainer[active_fields] type. As for serialization, a BitVector[O] with O being the number of optional fields would be prepended to the value to indicate presence or absence of optional fields. The concept can also be combined with CompatibleUnion, with individual type options having optional fields.
Optional[type] should not be represented as a CompatibleUnion[ProgressiveContainer[active_fields=[]], type] because of Merkleization and serialization overhead over the proposed update.
Backwards Compatibility
ProgressiveContainer[active_fields] is a new SSZ type and does not conflict with existing types.
CompatibleUnion[type_0, type_1, ...] conflicts with an earlier union proposal. However, it has only been used in deprecated specifications. Portal network also uses a union concept for network types, but does not use hash_tree_root on them, and could transition to the new compatible union proposal with a new networking version.