Alert Source Discuss
⚠️ Review Standards Track: Core

EIP-7495: SSZ StableContainer

New SSZ type to represent a flexible container with stable serialization and merkleization

Authors Etan Kissling (@etan-status)
Created 2023-08-18

Abstract

This EIP introduces a new Simple Serialize (SSZ) type to represent StableContainer[N] values.

A StableContainer[N] is an SSZ Container with stable serialization and merkleization even when individual fields become optional or new fields are introduced in the future.

Motivation

Stable containers are currently not representable in SSZ. Adding support provides these benefits:

  1. Stable signatures: Signing roots derived from a StableContainer[N] never change. In the context of Ethereum, this is useful for transaction signatures that are expected to remain valid even when future updates introduce additional transaction fields. Likewise, the overall transaction root remains stable and can be used as a perpetual transaction ID.

  2. Stable merkle proofs: Merkle proof verifiers that check specific fields of a StableContainer[N] do not need continuous updating when future updates introduce additional fields. Common fields always merkleize at the same generalized indices.

  3. Optional fields: Current SSZ formats do not support optional fields, prompting designs to use zero values instead. With StableContainer[N], the SSZ serialization is compact; inactive fields do not consume space.

Specification

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Type definition

Similar to the regular SSZ Container, StableContainer[N] defines an ordered heterogeneous collection of fields. N indicates the potential maximum number of fields to which it can ever grow in the future. N MUST be > 0.

As part of a StableContainer[N], fields of type Optional[T] MAY be defined. Such fields can either represent a present value of SSZ type T, or indicate absence of a value (indicated by None). The default value of an Optional[T] is None.

class Example(StableContainer[32]):
    a: uint64
    b: Optional[uint32]
    c: uint16

For the purpose of serialization, StableContainer[N] is always considered “variable-size” regardless of the individual field types.

Stability guarantees

The serialization and merkleization of a StableContainer[N] remains stable as long as:

  • The maximum capacity N does not change
  • The order of fields does not change
  • New fields are always added to the end
  • Required fields remain required T, or become an Optional[T]
  • Optional fields remain Optional[T], or become a required T

When an optional field becomes required, existing messages still have stable serialization and merkleization, but will be rejected on deserialization if not present.

Serialization

Serialization of StableContainer[N] is defined similarly to the existing logic for Container. Notable changes are:

  • A Bitvector[N] is constructed, indicating active fields within the StableContainer[N]. For required fields T and optional fields Optional[T] with a present value (not None), a True bit is included. For optional fields Optional[T] with a None value, a False bit is included. The Bitvector[N] is padded with False bits up through length N
  • Only active fields are serialized, i.e., fields with a corresponding True bit in the Bitvector[N]
  • The serialization of the Bitvector[N] is prepended to the serialized active fields
  • If variable-length fields are serialized, their offsets are relative to the start of serialized active fields, after the Bitvector[N]
def is_active_field(element):
    return not is_optional(element) or element is not None

# Determine active fields
active_fields = Bitvector[N](([is_active_field(element) for element in value] + [False] * N)[:N])
active_values = [element for element in value if is_active_field(element)]

# Recursively serialize
fixed_parts = [serialize(element) if not is_variable_size(element) else None for element in active_values]
variable_parts = [serialize(element) if is_variable_size(element) else b"" for element in active_values]

# Compute and check lengths
fixed_lengths = [len(part) if part != None else BYTES_PER_LENGTH_OFFSET for part in fixed_parts]
variable_lengths = [len(part) for part in variable_parts]
assert sum(fixed_lengths + variable_lengths) < 2**(BYTES_PER_LENGTH_OFFSET * BITS_PER_BYTE)

# Interleave offsets of variable-size parts with fixed-size parts
variable_offsets = [serialize(uint32(sum(fixed_lengths + variable_lengths[:i]))) for i in range(len(active_values))]
fixed_parts = [part if part != None else variable_offsets[i] for i, part in enumerate(fixed_parts)]

# Return the concatenation of the active fields `Bitvector` with the active
# fixed-size parts (offsets interleaved) and the active variable-size parts
return serialize(active_fields) + b"".join(fixed_parts + variable_parts)

Deserialization

Deserialization of a StableContainer[N] starts by deserializing a Bitvector[N]. That value MUST be validated:

  • For each required field, the corresponding bit in the Bitvector[N] MUST be True
  • For each optional field, the corresponding bit in the Bitvector[N] is not restricted
  • All extra bits in the Bitvector[N] that exceed the number of fields MUST be False

The rest of the data is deserialized same as a regular SSZ Container, consulting the Bitvector[N] to determine what optional fields are present in the data. Absent fields are skipped during deserialization and assigned None values.

Merkleization

The merkleization specification is extended with the following helper functions:

  • chunk_count(type): calculate the amount of leafs for merkleization of the type.
    • StableContainer[N]: always N, regardless of the actual number of fields in the type definition
  • mix_in_aux: Given a Merkle root root and an auxiliary SSZ object root aux return hash(root + aux).

To merkleize a StableContainer[N], a Bitvector[N] is constructed, indicating active fields within the StableContainer[N], using the same process as during serialization.

Merkleization hash_tree_root(value) of an object value is extended with:

  • mix_in_aux(merkleize(([hash_tree_root(element) if is_active_field(element) else Bytes32() for element in value.data] + [Bytes32()] * N)[:N]), hash_tree_root(value.active_fields)) if value is a StableContainer[N].

Variant[S]

Variant[S] serves as a subset of StableContainer S. While S still determines how the Variant[S] is merkleized, Variant[S] MAY implement additional restrictions on valid combinations of fields and serialization is optimized for a more compact representation based on these restrictions.

  • Fields in Variant[S] may have a different order than in S; this only affects serialization, the canonical order in S is always used for merkleization
  • Fields in Variant[S] may be required, despite being optional in S
  • Fields in Variant[S] may be missing, despite being optional in S
  • All fields that are required in S must be present in Variant[S]

Serialization of a specific Variant follows a similar scheme as the one for its underlying StableContainer, except that the leading Bitvector is replaced by a sparse representation that only indicates presence or absence of optional fields Optional[T]. Bits for required fields as well as trailing padding bits are not included when serializing Variant[S]. If there are no optional fields, the entire Bitvector is omitted. While this serialization is more compact, note that it is not forward compatible and that context information that determines the underlying data type has to be indicated out of bands. If forward compatibility is required, the Variant’s underlying data SHALL be serialized as defined by the underlying StableContainer.

Variant[S] is considered “variable-size” iff it contains any Optional[T] or any “variable-size” fields.

# Defines the common merkleization format and a portable serialization format across variants
class Shape(StableContainer[4]):
    side: Optional[uint16]
    color: uint8
    radius: Optional[uint16]

# Inherits merkleization format from `Shape`, but is serialized more compactly
class Square(Variant[Shape]):
    side: uint16
    color: uint8

# Inherits merkleization format from `Shape`, but is serialized more compactly
class Circle(Variant[Shape]):
    radius: uint16
    color: uint8

In addition, OneOf[S] is defined to provide a select_variant helper function for determining the Variant[S] to use when parsing StableContainer S. The select_variant helper function MAY incorporate environmental information, e.g., the fork schedule.

class AnyShape(OneOf[Shape]):
    @classmethod
    def select_variant(cls, value: Shape, circle_allowed = True) -> Type[Shape]:
        if value.radius is not None:
            assert circle_allowed
            return Circle
        if value.side is not None:
            return Square
        assert False

Rationale

What are the problems solved by StableContainer[N]?

Current SSZ types are only stable within one version of a specification, i.e., one fork of Ethereum. This is alright for messages pertaining to a specific fork, such as attestations or beacon blocks. However, it is a limitation for messages that are expected to remain valid across forks, such as transactions or receipts. In order to support evolving the features of such perpetually valid message types, a new SSZ scheme needs to be defined. Furthermore, consumers of Merkle proofs may have a different software update cadence as Ethereum; an implementation should not break just because a new fork introduces unrelated new features.

To avoid restricting design space, the scheme has to support extension with new fields, obsolescence of old fields, and new combinations of existing fields. When such adjustments occur, old messages must still deserialize correctly and must retain their original Merkle root.

What are the problems solved by Variant[S]?

The forward compatible merkleization of StableContainer may be desirable even in situations where only a single variant is valid at any given time, e.g., as determined by the fork schedule. In such situations, message size can be reduced and type safety increased by exchanging Variant[S] instead of the underlying StableContainer. This can be useful, e.g., for consensus data structures such as BeaconState, to ensure that Merkle proofs for its fields remain compatible across forks.

Why not Union[T, U, V]?

Typically, the individual Union cases share some form of thematic overlap, sharing certain fields with each other. In a Union, shared fields are not necessarily merkleized at the same generalized indices. Therefore, Merkle proof systems would have to be updated each time that a new flavor is introduced, even when the actual changes are not of interest to the particular system.

Furthermore, SSZ Union types are currently not used in any final Ethereum specification and do not have a finalized design themselves. The StableContainer[N] serializes very similar to current Union[T, U, V] proposals, with the difference being a Bitvector[N] as a prefix instead of a selector byte. This means that the serialized byte lengths are comparable.

Why not a Container full of Optional[T]?

If Optional[T] is modeled as an SSZ type, each individual field introduces serialization and merkleization overhead. As an Optional[T] would be required to be “variable-size”, lots of additional offset bytes would have to be used in the serialization. For merkleization, each individual Optional[T] would require mixing in a bit to indicate presence or absence of the value.

Additionally, every time that the number of fields reaches a new power of 2, the Merkle roots break, as the number of chunks doubles. The StableContainer[N] solves this by artificially extending the Merkle tree to N chunks regardless of the actual number of fields currently specified. Because N is constant across specification versions, the Merkle tree shape remains stable. The overhead of the additional empty placeholder leaves only affects serialization of the Bitvector[N] (1 byte per 8 leaves); the number of required hashes during merkleization only grows logarithmically with N.

Backwards Compatibility

StableContainer[N] and Variant[S] are new SSZ types and do not conflict with other SSZ types currently in use.

Test Cases

See EIP assets.

Reference Implementation

See EIP assets, based on protolambda/remerkleable.

Security Considerations

None

Copyright and related rights waived via CC0.

Citation

Please cite this document as:

Etan Kissling (@etan-status), "EIP-7495: SSZ StableContainer [DRAFT]," Ethereum Improvement Proposals, no. 7495, August 2023. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-7495.