This EIP proposes the creation of a new subprotocol, etha, enabling Ethereum nodes to communicate available block spans via a bitmask. Each bit represents a 100_000-block span within each 1_000_000 block range of chain history. Nodes use this bitmask to signal stored spans and commit to storing future spans as they are created. This allows peers to make informed decisions about data availability without first connecting and querying for it. The bitmask repeats every 1 million blocks for straightforward reasoning about data availability probabilities.
The etha subprotocol has the same functionality to serve historical data using message types copied from the eth protocol, enabling efficient data retrieval.
Motivation
With EIP-4444, nodes may prune historical data while others continue serving it. Determining data availability by connecting and requesting blocks is inefficient consuming unnexessary bandwidth. This EIP addresses this inefficiency by enabling nodes to shard chain history into 100_000 block segments and signal availability via a bitmask.
By introducing a separate subprotocol, etha, nodes can exchange this information seamlessly and retain the ability to serve historical data without impacting existing eth protocol versions.
Specification
Subprotocol Handshake
Introduce a new subprotocol named etha.
Define the handshake message for the etha subprotocol as follows:
blockBitmask is a 10-bit bitmask, with each bit representing a 100_000-block range per 1_000_000 blocks of history.
the rest of the elements are as defined in eth/69
Supported Messages
The etha subprotocol MUST include support for the following messages from the eth/69 protocol to facilitate historical data serving:
GetBlockBodies (0x05): Request block bodies.
BlockBodies (0x06): Response to GetBlockBodies.
GetReceipts (0x0f): Request receipts.
Receipts (0x10): Response to GetReceipts.
The semantics and payload structures for these messages are identical to their counterparts in the eth/69 protocol, ensuring compatibility for historical data serving.
Node Behavior
Bitmask Initialization: Nodes MAY set at least one bit in the blockBitmask to on upon startup and backfill the corresponding 100_000-block span.
Shard Retention Probability: Nodes MUST retain new block spans according to their bitmask, aiming to cover at least 10% of chain history.
Additional Shard Retention Buffer: Nodes SHOULD retain 1_000 blocks before and after shard boundaries to accommodate edge cases.
Commitment to Future Ranges: Nodes MUST retain spans corresponding to their advertised bitmask as new blocks are added.
Bitmask Space: The 100_000 range blockBitmask repeats every 1 million blocks, enabling efficient representation of historical data locality across epochs.
Upon connection using etha, nodes exchange the handshake message with the blockBitmask. This single handshake eliminates the need for additional message types.
ENR Extension
Alternatively, the blockBitmask could be derived or encoded into the Ethereum Node Record (ENR), enabling nodes to advertise block spans without a handshake. As an example, the blockBitmask can be derived from the secp256k1 field of the ENR. However, this method lacks the authentication and reliability of the handshake approach. Additionally, there is not guarantee that the node you are connecting to supports the etha subprotocol.
Rationale
The bitmask approach provides a flexible means to represent and retain block data while committing to future spans. This mechanism aligns with the pruning proposed in EIP-4444, while ensuring that historical and future data spans remain available across the network.
A similar bitlist approach is already used in the Consensus Layer for attestation subnets, making it a familiar and efficient method for representing data spans. Additionally, committing to future spans ensures better predictability and stability for data locality.
The etha subprotocol separates this functionality from eth ensuring nodes dont hammer other nodes with requests on historical ranges that they do not posses on the eth protocol.
Backwards Compatibility
The etha subprotocol is independent of the eth protocol.
This EIP does not affect the consensus engine or require a hard fork.
Security Considerations
There are some considerations:
Data unavailability for any given shard can be modeled as: P = (0.9)^n, where n is the number of peers. Assuming a random distribution of nodes that are participating in EIP-7801 history sharding, for 25 peers, this chance is 7%. For 32 peers, this chance drops to 3.4%. This assumes that a significant number of nodes on the network are serving at least one shard. Adoption by a majority of clients as a default would likely be necessary for a complete sharded history to be available and replicated sufficiently across the network.
As history grows, so will the size of the retained shards on disk, thus raising the storage requirements per node. However, nodes will still benefit from a ~90% storage reduction over the present chain storage requirements, and will scale their future chain storage requirements by only 10% of the rate they would have by retaining all history.
Copyright
This document is CC0-licensed; rights are waived through CC0.
Citation
Please cite this document as:
Ahmad Bitar (@smartprogrammer93) <smartprogrammer@windowslive.com>, Giulio Rebuffo (@Giulio2002), Gary Schulte (@garyschulte) <garyschulte@gmail.com>, "EIP-7801: etha - Sharded Blocks Subprotocol [DRAFT]," Ethereum Improvement Proposals, no. 7801, October 2024. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-7801.