Introduce a new binary state tree, intended to replace the hexary patricia trees. Account and storage tries are merged into a single tree with 32-byte keys, which also contains contracts code. Account data is broken into independent leaves which are grouped by 256 in order to provide some locality.
Note: the hash function used in the current draft is not final, other potential candidates are Keccak or Poseidon2. For the latter, an extra section on how to encode 32-bytes->[FiniteField] will be created. Do not assume Blake3 is a final decision. <– TODO finalize hash function –>
Motivation
Ethereum’s long-term goal is to allow blocks to be proved with validity proof so that chain verification is as simple and fast as possible. One of the most challenging parts of achieving that goal is proving the state of the tree, which is required for EVM execution.
The current Merkle-Patricia Trie (MPT) design isn’t friendly for validity proofs for many reasons, such as using RLP for node encoding, Keccak as a hashing function, being a “tree of trees”, and not including accounts code as part of the state.
Apart from requiring a state tree design that is friendly for block validity proofs, it’s also important to have fast and small regular Merkle proofs when the amount of state to prove is small. This is useful not only for the application layer but also for supporting future protocol needs in a stateless world (e.g: inclusion lists).
Regarding these regular Merkle proofs in an MPT, since it’s a hexary tree, their size is quite big on average and in worst-case scenarios. Given a 2^32 size tree, the expected size for proving a single branch is 15 * 32 * log_16(2^32) = 3840. From a worst-case block perspective, if 30M gas is used to access only a single byte of different account codes since this code isn’t chunkified, we’d need 30M/2400*(5*480+24k)=330MB.
A binary tree has a good balance between out-of-circuit and in-circuit proving. Since the tree arity is two, this reduces the size of regular Merkle proofs (i.e., # sibilings * log_arity(N) for arity 2 is better than for arity 16). Moreover, we propose switching from Keccak to Blake3, which has proven security and is more amenable to modern proving systems.
Specification
The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 and RFC 8174.
Notable changes from the hexary structure
The account and storage tries are merged into a single trie.
RLP is no longer used.
The account’s code is chunked and included in the tree.
Account data (e.g., balance, nonce, first storage slots, first code-chunks) is co-located in the tree to reduce branch openings.
Tree structure
The proposed Binary Tree stores key-value entries where both the key and value are 32 bytes. The first 31-bytes of the key define the entry stem, and the last byte is the subindex in that stem. If two keys have the same stem, they live in the same “big branch” — this co-locates groups of 256 keys (i.e: keys with the same first 31-bytes).
(More details about the purple/orange part in the “Tree embedding” section)
We can distinguish four node types:
InternalNode has a left_hash and right_hash field (black).
StemNode has fields stem, left_hash and right_hash (blue).
LeafNode has a field value which is a 32-byte blob or empty. (orange).
EmptyNode represents an empty node/sub-tree.
The path to a StemNode is defined by the key’s first 248-bits (31-bytes) from MSB to LSB. From this node, a subtree of 256 values exists indexed by the key’s last byte (8 bits). A newly created StemNode subtree (i.e: 256 values) has all leaves with value: empty.
The path length isn’t 248-bits, but contains the minimal amount of InternalNodes required if other stems exist with the same prefix. As in, when walking the 248-bit path inserting a new StemNode, you can end on a EmptyNode or StemNode. In the former, you replace it with the new StemNode. In the latter, you create the as many InternalNodes as common bits both stems have as perfixes. Another way to look at this is that there aren’t extension nodes in paths, but we use the minimal amount of InternalNodes to place StemNodes with common prefixes.
Below is an implementation that describes the tree structure:
classStemNode:def__init__(self,stem:bytes):assertlen(stem)==31,"stem must be 31 bytes"self.stem=stemself.values:list[Optional[bytes]]=[None]*256defset_value(self,index:int,value:bytes):self.values[index]=valueclassInternalNode:def__init__(self):self.left=Noneself.right=NoneclassBinaryTree:def__init__(self):self.root=Nonedefinsert(self,key:bytes,value:bytes):assertlen(key)==32,"key must be 32 bytes"assertlen(value)==32,"value must be 32 bytes"stem=key[:31]subindex=key[31]ifself.rootisNone:self.root=StemNode(stem)self.root.set_value(subindex,value)returnself.root=self._insert(self.root,stem,subindex,value,0)def_insert(self,node,stem,subindex,value,depth):assertdepth<248,"depth must be less than 248"ifnodeisNone:node=StemNode(stem)node.set_value(subindex,value)returnnodestem_bits=self._bytes_to_bits(stem)ifisinstance(node,StemNode):# If the stem already exists, update the value.
ifnode.stem==stem:node.set_value(subindex,value)returnnodeexisting_stem_bits=self._bytes_to_bits(node.stem)returnself._split_leaf(node,stem_bits,existing_stem_bits,subindex,value,depth)# We're in an internal node, go left or right based on the bit.
bit=stem_bits[depth]ifbit==0:node.left=self._insert(node.left,stem,subindex,value,depth+1)else:node.right=self._insert(node.right,stem,subindex,value,depth+1)returnnodedef_split_leaf(self,leaf,stem_bits,existing_stem_bits,subindex,value,depth):# If the stem bits are the same up to this depth, we need to create another
# internal node and recurse.
ifstem_bits[depth]==existing_stem_bits[depth]:new_internal=InternalNode()bit=stem_bits[depth]ifbit==0:new_internal.left=self._split_leaf(leaf,stem_bits,existing_stem_bits,subindex,value,depth+1)else:new_internal.right=self._split_leaf(leaf,stem_bits,existing_stem_bits,subindex,value,depth+1)returnnew_internalelse:new_internal=InternalNode()bit=stem_bits[depth]stem=self._bits_to_bytes(stem_bits)ifbit==0:new_internal.left=StemNode(stem)new_internal.left.set_value(subindex,value)new_internal.right=leafelse:new_internal.right=StemNode(stem)new_internal.right.set_value(subindex,value)new_internal.left=leafreturnnew_internal
Node merkelization
Define hash(value) as:
hash([0x00] * 64) = [0x00] * 32
hash(value) = blake3(value), otherwise.
The value parameter lengths can only be 32 and 64.
def_hash(self,data):ifdatain(None,b"\x00"*64):returnb"\x00"*32assertlen(data)==64orlen(data)==32,"data must be 32 or 64 bytes"returnblake3(data).digest()defmerkelize(self):def_merkelize(node):ifnodeisNone:returnb"\x00"*32ifisinstance(node,InternalNode):left_hash=_merkelize(node.left)right_hash=_merkelize(node.right)returnself._hash(left_hash+right_hash)level=[self._hash(x)forxinnode.values]whilelen(level)>1:new_level=[]foriinrange(0,len(level),2):new_level.append(self._hash(level[i]+level[i+1]))level=new_levelreturnself._hash(node.stem+b"\0"+level[0])return_merkelize(self.root)
Tree embedding
Instead of a two-layer structure like the MPT, we will embed all information into a unique key: value space in the proposed single tree. This section specifies which tree keys store the state’s information (account header data, code, storage). The data is colocated in such way that data that is usually accessed together lives in the same StemNode which reduces branch openings.
The following is a big picture of what we’ll continue to describe in the rest of this section:
The account stem (green account node) contains accounts basic data, first 64-storage slots, and first 128-code chunks. This co-location of data allows to have in single stem branch data that is usually accessed together. The rest of storage slots and code-chunks are spread into groups of 256 values in the rest of the tree – this is also done for convenience since slots/code-chunks that are close together will share the same stem branch.
Parameter
Value
BASIC_DATA_LEAF_KEY
0
CODE_HASH_LEAF_KEY
1
HEADER_STORAGE_OFFSET
64
CODE_OFFSET
128
STEM_SUBTREE_WIDTH
256
MAIN_STORAGE_OFFSET
256**31
It’s a required invariant that STEM_SUBTREE_WIDTH > CODE_OFFSET > HEADER_STORAGE_OFFSET and that HEADER_STORAGE_OFFSET is greater than the leaf keys. Additionally, MAIN_STORAGE_OFFSET MUST be a power of STEM_SUBTREE_WIDTH.
Note that addresses are always passed around as an Address32. To convert existing addresses to Address32, prepend with 12 zero bytes:
An account’s version, balance, nonce, and code_size fields are packed with big-endian encoding in the value found at BASIC_DATA_LEAF_KEY:
Name
Offset
Size
version
0
1
code_size
5
3
nonce
8
8
balance
16
16
Bytes 1..4 are reserved for future use.
The current layout and encoding allow for an extension of code_size to 4 bytes without changing the account version. The goal of packing these fields together in a basic-data leaf is to reduce gas costs, since we only need one branch opening compared with usual 3 or 4 if the fields aren’t packed. This also simplifies witness generation.
When any account header field is set, the version field is also set to zero. The codehash and code_size fields are set upon contract or EOA creation.
Chunk i stores a 32 byte value, where bytes 1…31 are bytes i*31...(i+1)*31 - 1 of the code (i.e. the i’th 31-byte slice of it), and byte 0 is the number of leading bytes that are part of PUSHDATA (e.g. if part of the code is ...PUSH4 99 98 | 97 96 PUSH1 128 MSTORE... where | is the position where a new chunk begins, then the encoding of the latter chunk would begin 2 97 96 PUSH1 128 MSTORE to reflect that the first 2 bytes are PUSHDATA).
For precision, here is an implementation of code chunkification:
PUSH_OFFSET=95PUSH1=PUSH_OFFSET+1PUSH32=PUSH_OFFSET+32defchunkify_code(code:bytes)->Sequence[bytes32]:# Pad to multiple of 31 bytes
iflen(code)%31!=0:code+=b'\\x00'*(31-(len(code)%31))# Figure out how much pushdata there is after+including each byte
bytes_to_exec_data=[0]*(len(code)+32)pos=0whilepos<len(code):ifPUSH1<=code[pos]<=PUSH32:pushdata_bytes=code[pos]-PUSH_OFFSETelse:pushdata_bytes=0pos+=1forxinrange(pushdata_bytes):bytes_to_exec_data[pos+x]=pushdata_bytes-xpos+=pushdata_bytes# Output chunks
return[bytes([min(bytes_to_exec_data[pos],31)])+code[pos:pos+31]forposinrange(0,len(code),31)]
Note that storage slots in the same size STEM_SUBTREE_WIDTH range (i.e. a range with the form x*STEM_SUBTREE_WIDTH ... (x+1)*STEM_SUBTREE_WIDTH-1) are all, except for the HEADER_STORAGE_OFFSET special case, part of a single stem.
This EIP defines a new Binary Tree that starts empty. Only new state changes are stored in the tree. The MPT continues to exist but is frozen. This sets the stage for a future hard fork that migrates the MPT data to this Binary Tree (EIP-7748).
Single tree design
The proposal uses a single-layer tree structure with 32-byte keys and values for several reasons:
Simplicity: working with the abstraction of a key/value store makes it easier to write code dealing with the tree (e.g. database reading/writing, caching, syncing, proof creation, and verification) and upgrade it to other trees in the future. Additionally, witness gas rules can become simpler and clearer.
Uniformity: the state is uniformly spread throughout the tree; even if a single contract has millions of storage slots, the contract’s storage slots are not concentrated in one place. This is useful for state-syncing algorithms. Additionally, it helps reduce the effectiveness of unbalanced tree-filling attacks.
Extensibility: account headers and code being in the same structure as storage makes it easier to extend both features and even add new structures if desired.
SNARK friendliness and Post-Quantum security
The proposed design should consider the usual complexity, performance, and efficiency for out-of-circuit implementations (i.e. EL clients) and in-circuit ones for generating proofs in ZK circuits.
The tree design structure tries to be simple, by not having complex branching rules. For example, in contrast with the MPT, we avoid extension nodes injected in the middle of branches. Also, RLP encoding was removed since it adds unnecessary complexity. Although complexity can be managed more easily in out-of-circuit implementations, it’s valuable to help circuit implementations be as simple as possible to avoid proving overhead.
The most important factor in the design is the cryptographic hash function used for merkelization. This hash function should have efficient implementations both out and in circuits. The proposed BLAKE3 hash function was chosen since it provides a good balance between:
Out-of-circuit performance (i.e. for raw EL client execution)
In circuit performance (i.e. for proving in a validity proof)
Proven security
Poseidon2 was considered to perform well in the first two points, but further research is still needed to ensure its security.
Due to recent progress in the quantum computing field, experts estimations consider that they can become real in the 2030s. NIST suggests stop using ECC by 2030 too. Other alternatives, such as Verkle Trees, introduce a new cryptography stack to the protocol, which relies on elliptic curves that aren’t post-quantum secure. This makes the current tree proposal attractive since it only depends on hash functions, which are still safe in this new paradigm.
Moreover, there has been impressive progress in proving systems, which indicates that we could be close to achieving the required performance for creating pre-state & post-state proofs fast enough. This last point is crucial since the main advantage of Verkle Trees was having proofs that are small and fast to generate.
Finally, the current state tree proposal will probably be the final state tree used in the protocol, compared with Verkle Trees, which would eventually need to be replaced by a post-quantum secure alternative.
Arity-2
Binary tries have been chosen primarily because they reduce the witness size. In general, in an N-element tree with each element having k children, the average size of a branch is roughly 32 * (k-1) * log(N) / log(k) plus a few percent for overhead. 32 is the length of a hash; the k-1 refers to the fact that a Merkle proof needs to provide all k-1 sister nodes at each level, and log(N) / log(k) is an approximation of the number of levels in the tree (e.g. a tree where each node has 5 children, with 625 nodes total, would have depth 4, as 625 = 5**4 and so log(625) / log(5) = 4).
For any N, the expression is minimized at k = 2. Here’s a table of branch lengths for different k values assuming N = 2**24:
k (children per node)
Branch length (chunks)
Branch length (bytes)
2
1 * 24 = 24
768
4
3 * 12 = 36
1152
8
7 * 8 = 56
1792
16
15 * 6 = 90
2880
Actual branch lengths might be slightly larger than this due to uneven distribution, but the pattern of k=2 remains by far the best.
Tree depth
The tree design attempts to be as simple as possible considering both out-of-circuit and circuit implementations, while satisfying our throughput constraints on proving hashes.
The proposed design avoids a full 248-bit depth as it would happen in a Sparse Merkle Tree (SMT). Since we don’t use arithmetic hashes (e.g. Poseidon2), this is required to be aggressive in avoiding hashing load in proving systems which today is quite tight on throughput in commodity hardware. There are some tricks that could be used to mitigate this, but this also adds extra complexity to the spec.
Moreover, we could push this further trying to introduce extension nodes in middle of paths as done in a MPT, but this also adds complexity that might not worth it considering the tree should be quite balanced.
State-expiry
State-expiry strategies such as EIP-7736 could still be applied, requiring a change in the design. One potential solution is adding a field the StemNode with epoch as described in the mentioned EIP. Another alternative is to use 247-bits for the stem, and have two subtrees StemValuesNode, which would correspond to the current 256-values, and StemMetaNode which is also a 256-subtree that can be used to store arbitrary stem metadata.
Backwards Compatibility
The main breaking changes are:
(1) Gas costs for code chunk access can have an impact on applications’ economic viability
(2) Tree structure change makes in-EVM proofs of historical state no longer work
(1) can be mitigated by increasing the gas limit while implementing this EIP, allowing the same economic viability for applications.