EIP4844: Shard Blob Transactions
Shard Blob Transactions scale dataavailability of Ethereum in a simple, forwardscompatible manner.
Author  Vitalik Buterin, Dankrad Feist, Diederik Loerakker, George Kadianakis, Matt Garnett, Mofi Taiwo 

DiscussionsTo  https://ethereummagicians.org/t/eip4844shardblobtransactions/8430 
Status  Draft 
Type  Standards Track 
Category  Core 
Created  20220225 
Requires  1559, 2718, 2930 
Table of Contents
Abstract
Introduce a new transaction format for “blobcarrying transactions” which contain a large amount of data that cannot be accessed by EVM execution, but whose commitment can be accessed. The format is intended to be fully compatible with the format that will be used in full sharding.
Motivation
Rollups are in the short and medium term, and possibly in the long term, the only trustless scaling solution for Ethereum. Transaction fees on L1 have been very high for months and there is greater urgency in doing anything required to help facilitate an ecosystemwide move to rollups. Rollups are significantly reducing fees for many Ethereum users: Optimism and Arbitrum frequently provide fees that are ~38x lower than the Ethereum base layer itself, and ZK rollups, which have better data compression and can avoid including signatures, have fees ~40100x lower than the base layer.
However, even these fees are too expensive for many users. The longterm solution to the longterm inadequacy of rollups by themselves has always been data sharding, which would add ~16 MB per block of dedicated data space to the chain that rollups could use. However, data sharding will still take a considerable amount of time to finish implementing and deploying.
This EIP provides a stopgap solution until that point by implementing the transaction format that would be used in sharding, but not actually sharding those transactions. Instead, the data from this transaction format is simply part of the beacon chain and is fully downloaded by all consensus nodes (but can be deleted after only a relatively short delay). Compared to full data sharding, this EIP has a reduced cap on the number of these transactions that can be included, corresponding to a target of ~1 MB per block and a limit of ~2 MB.
Specification
Parameters
Constant  Value 

BLOB_TX_TYPE 
Bytes1(0x05) 
FIELD_ELEMENTS_PER_BLOB 
4096 
BLS_MODULUS 
52435875175126190479447740508185965837690552500527637822603658699938581184513 
KZG_SETUP_G2 
Vector[G2Point, FIELD_ELEMENTS_PER_BLOB] , contents TBD 
KZG_SETUP_LAGRANGE 
Vector[KZGCommitment, FIELD_ELEMENTS_PER_BLOB] , contents TBD 
ROOTS_OF_UNITY 
Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB] 
BLOB_COMMITMENT_VERSION_KZG 
Bytes1(0x01) 
POINT_EVALUATION_PRECOMPILE_ADDRESS 
Bytes20(0x14) 
POINT_EVALUATION_PRECOMPILE_GAS 
50000 
MAX_BLOBS_PER_BLOCK 
16 
TARGET_BLOBS_PER_BLOCK 
8 
MAX_BLOBS_PER_TX 
2 
GASPRICE_UPDATE_FRACTION_PER_BLOB 
64 
MAX_VERSIONED_HASHES_LIST_SIZE 
2**24 
MAX_CALLDATA_SIZE 
2**24 
MAX_ACCESS_LIST_SIZE 
2**24 
MAX_ACCESS_LIST_STORAGE_KEYS 
2**24 
MAX_TX_WRAP_KZG_COMMITMENTS 
2**24 
LIMIT_BLOBS_PER_TX 
2**24 
GAS_PER_BLOB 
120000 
HASH_OPCODE_BYTE 
Bytes1(0x49) 
HASH_OPCODE_GAS 
3 
Type aliases
Type  Base type  Additional checks 

BLSFieldElement 
uint256 
x < BLS_MODULUS 
Blob 
Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB] 

VersionedHash 
Bytes32 

KZGCommitment 
Bytes48 
Same as BLS standard “is valid pubkey” check but also allows 0x00..00 for pointatinfinity 
KZGProof 
Bytes48 
Same as for KZGCommitment 
Helpers
Converts a blob to its corresponding KZG point:
def lincomb(points: List[KZGCommitment], scalars: List[BLSFieldElement]) > KZGCommitment:
"""
BLS multiscalar multiplication. This function can be optimized using Pippenger's algorithm and variants.
"""
r = bls.Z1
for x, a in zip(points, scalars):
r = bls.add(r, bls.multiply(x, a))
return r
def blob_to_kzg(blob: Blob) > KZGCommitment:
return lincomb(KZG_SETUP_LAGRANGE, blob)
Converts a KZG point into a versioned hash:
def kzg_to_versioned_hash(kzg: KZGCommitment) > VersionedHash:
return BLOB_COMMITMENT_VERSION_KZG + hash(kzg)[1:]
Verifies a KZG evaluation proof:
def verify_kzg_proof(polynomial_kzg: KZGCommitment,
x: BLSFieldElement,
y: BLSFieldElement,
quotient_kzg: KZGProof) > bool:
# Verify: P  y = Q * (X  x)
X_minus_x = bls.add(KZG_SETUP_G2[1], bls.multiply(bls.G2, BLS_MODULUS  x))
P_minus_y = bls.add(polynomial_kzg, bls.multiply(bls.G1, BLS_MODULUS  y))
return bls.pairing_check([
[P_minus_y, bls.neg(bls.G2)],
[quotient_kzg, X_minus_x]
])
Efficiently evaluates a polynomial in evaluation form using the barycentric formula
def bls_modular_inverse(x: BLSFieldElement) > BLSFieldElement:
"""
Compute the modular inverse of x
i.e. return y such that x * y % BLS_MODULUS == 1 and return 0 for x == 0
"""
return pow(x, 1, BLS_MODULUS) if x != 0 else 0
def div(x, y):
"""Divide two field elements: `x` by `y`"""
return x * bls_modular_inverse(y) % BLS_MODULUS
def evaluate_polynomial_in_evaluation_form(poly: List[BLSFieldElement], x: BLSFieldElement) > BLSFieldElement:
"""
Evaluate a polynomial (in evaluation form) at an arbitrary point `x`
Uses the barycentric formula:
f(x) = (1  x**WIDTH) / WIDTH * sum_(i=0)^WIDTH (f(DOMAIN[i]) * DOMAIN[i]) / (x  DOMAIN[i])
"""
width = len(poly)
assert width == FIELD_ELEMENTS_PER_BLOB
inverse_width = bls_modular_inverse(width)
for i in range(width):
r += div(poly[i] * ROOTS_OF_UNITY[i], (x  ROOTS_OF_UNITY[i]) )
r = r * (pow(x, width, BLS_MODULUS)  1) * inverse_width % BLS_MODULUS
return r
Approximates 2 ** (numerator / denominator)
, with the simplest possible approximation that is continuous and has a continuous derivative:
def fake_exponential(numerator: int, denominator: int) > int:
cofactor = 2 ** (numerator // denominator)
fractional = numerator % denominator
return cofactor + (
fractional * cofactor * 2 +
(fractional ** 2 * cofactor) // denominator
) // (denominator * 3)
New transaction type
We introduce a new EIP2718 transaction type,
with the format being the single byte BLOB_TX_TYPE
followed by an SSZ encoding of the
SignedBlobTransaction
container comprising the transaction contents:
class SignedBlobTransaction(Container):
message: BlobTransaction
signature: ECDSASignature
class BlobTransaction(Container):
chain_id: uint256
nonce: uint64
priority_fee_per_gas: uint256
max_basefee_per_gas: uint256
gas: uint64
to: Union[None, Address] # Address = Bytes20
value: uint256
data: ByteList[MAX_CALLDATA_SIZE]
access_list: List[AccessTuple, MAX_ACCESS_LIST_SIZE]
blob_versioned_hashes: List[VersionedHash, MAX_VERSIONED_HASHES_LIST_SIZE]
class AccessTuple(Container):
address: Address # Bytes20
storage_keys: List[Hash, MAX_ACCESS_LIST_STORAGE_KEYS]
class ECDSASignature(Container):
y_parity: boolean
r: uint256
s: uint256
The priority_fee_per_gas
and max_basefee_per_gas
fields follow EIP1559 semantics,
and access_list
as in EIP2930
.
EIP2718
is extended with a “wrapper data”, the typed transaction can be encoded in two forms, dependent on the context:
 Network (default):
TransactionType  TransactionNetworkPayload
, orLegacyTransaction
 Minimal (as in execution payload):
TransactionType  TransactionPayload
, orLegacyTransaction
Executionpayloads / blocks use the minimal encoding of transactions. In the transactionpool and local transactionjournal the network encoding is used.
For previous types of transactions the network encoding is no different, i.e. TransactionNetworkPayload == TransactionPayload
.
The TransactionNetworkPayload
wraps a TransactionPayload
with additional data:
this wrapping data SHOULD be verified directly before or after signature verification.
When a blob transaction is passed through the network (see the Networking section below),
the TransactionNetworkPayload
version of the transaction also includes blobs
and kzgs
(commitments list).
The execution layer verifies the wrapper validity against the inner TransactionPayload
after signature verification as:
 All hashes in
blob_versioned_hashes
must start with the byteBLOB_COMMITMENT_VERSION_KZG
 There may be at most
MAX_BLOBS_PER_TX
blob commitments in any single transaction.  There may be at most
MAX_BLOBS_PER_BLOCK
total blob commitments in a valid block.  There is an equal amount of versioned hashes, kzg commitments and blobs.
 The KZG commitments hash to the versioned hashes, i.e.
kzg_to_versioned_hash(kzg[i]) == versioned_hash[i]
 The KZG commitments match the blob contents. (Note: this can be optimized with additional data, using a proof for a random evaluation at two points derived from the commitment and blob data)
The signature is verified and tx.origin
is calculated as follows:
def tx_hash(tx: SignedBlobTransaction) > Bytes32:
# The preimage is prefixed with the transactiontype to avoid hash collisions with other tx hashers and types
return keccak256(BLOB_TX_TYPE + ssz.hash_tree_root(tx.message))
def get_origin(tx: SignedBlobTransaction) > Address:
sig = tx.signature
# v = int(y_parity) + 27, same as EIP1559
return ecrecover(tx_hash(tx), int(sig.y_parity)+27, sig.r, sig.s)
Header extension
The current header encoding is extended with a new 256bit unsigned integer field excess_blobs
. This is the running
total of excess blobs included on chain since this EIP was activated. If the total number of blobs is below the
average, excess_blobs
is capped at zero.
The resulting RLP encoding of the header is therefore:
rlp([
parent_hash,
ommers_hash,
coinbase,
state_root,
tx_root,
receipt_root,
bloom,
difficulty,
number,
gas_limit,
gas_used,
time,
extra,
mix_digest,
nonce,
base_fee,
excess_blobs
])
The value of excess_blobs
can be calculated using the parent header and number of blobs in the block.
def calc_excess_blobs(parent: Header, new_blobs: int) > int:
if parent.excess_blobs + new_blobs < TARGET_BLOBS_PER_BLOCK:
return 0
else:
return parent.excess_blobs + new_blobs  TARGET_BLOBS_PER_BLOCK
Beacon chain validation
On the consensuslayer the blobs are now referenced, but not fully encoded, in the beacon block body. Instead of embedding the full contents in the body, the contents of the blobs are propagated separately, as a “sidecar”.
This “sidecar” design provides forward compatibility for further data increases by blackboxing is_data_available()
:
with full sharding is_data_available()
can be replaced by dataavailabilitysampling (DAS) thus avoiding all blobs being downloaded by all beacon nodes on the network.
Note that the consensuslayer is tasked with persisting the blobs for data availability, the executionlayer is not.
The ethereum/consensusspecs
repository defines the following beaconnode changes involved in this EIP:
 Beacon chain: process updated beacon blocks and ensure blobs are available.
 P2P network: gossip and sync updated beacon block types and new blobs sidecars.
 Honest validator: produce beacon blocks with blobs, publish the blobs sidecars.
Opcode to get versioned hashes
We add an opcode DATAHASH
(with byte value HASH_OPCODE_BYTE
) which takes as input one stack argument index
,
and returns tx.message.blob_versioned_hashes[index]
if index < len(tx.message.blob_versioned_hashes)
,
and otherwise zero.
The opcode has a gas cost of HASH_OPCODE_GAS
.
Point evaluation precompile
Add a precompile at POINT_EVALUATION_PRECOMPILE_ADDRESS
that evaluates a proof that a particular blob resolves
to a particular value at a point.
The precompile costs POINT_EVALUATION_PRECOMPILE_GAS
and executes the following logic:
def point_evaluation_precompile(input: Bytes) > Bytes:
# Verify P(z) = a
# versioned hash: first 32 bytes
versioned_hash = input[:32]
# Evaluation point: next 32 bytes
x = int.from_bytes(input[32:64], 'little')
assert x < BLS_MODULUS
# Expected output: next 32 bytes
y = int.from_bytes(input[64:96], 'little')
assert y < BLS_MODULUS
# The remaining data will always be the proof, including in future versions
# input kzg point: next 48 bytes
data_kzg = input[96:144]
assert kzg_to_versioned_hash(data_kzg) == versioned_hash
# Quotient kzg: next 48 bytes
quotient_kzg = input[144:192]
assert verify_kzg_proof(data_kzg, x, y, quotient_kzg)
return Bytes([])
Gas price of blobs (Simplified version)
For early draft implementations, we simply change get_blob_gas(parent)
to always return GAS_PER_BLOB
.
Gas price update rule (Full version)
We propose a simple independent EIP1559style targeting rule to compute the gas cost of the transaction.
We use the excess_blobs
header field to store persistent data needed to compute the cost.
def get_intrinsic_gas(tx: SignedBlobTransaction, parent: Header) > int:
intrinsic_gas = 21000 # G_transaction
if tx.message.to == None: # i.e. if a contract is created
intrinsic_gas = 53000
# EIP2028 data gas cost reduction for zero bytes
intrinsic_gas += 16 * len(tx.message.data)  12 * len(tx.message.data.count(0))
# EIP2930 Optional access lists
intrinsic_gas += 1900 * sum(len(entry.storage_keys) for entry in tx.message.access_list) + 2400 * len(tx.message.access_list)
# New additional gas cost per blob
intrinsic_gas += len(tx.message.blob_versioned_hashes) * get_blob_gas(parent)
return intrinsic_gas
def get_blob_gas(header: Header) > int:
return fake_exponential(
header.excess_blobs,
GASPRICE_UPDATE_FRACTION_PER_BLOB
)
Networking
Transactions are presented as TransactionType  TransactionNetworkPayload
on the execution layer network,
the payload is a SSZ encoded container:
class BlobTransactionNetworkWrapper(Container):
tx: SignedBlobTransaction
# KZGCommitment = Bytes48
blob_kzgs: List[KZGCommitment, MAX_TX_WRAP_KZG_COMMITMENTS]
# BLSFieldElement = uint256
blobs: List[Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB], LIMIT_BLOBS_PER_TX]
# KZGProof = Bytes48
kzg_aggregated_proof: KZGProof
We do networklevel validation of BlobTransactionNetworkWrapper
objects as follows:
def hash_to_bls_field(x: Container) > BLSFieldElement:
"""
This function is used to generate FiatShamir challenges. The output is not uniform over the BLS field.
"""
return int.from_bytes(hash_tree_root(x), "little") % BLS_MODULUS
def compute_powers(x: BLSFieldElement, n: uint64) > List[BLSFieldElement]:
current_power = 1
powers = []
for _ in range(n):
powers.append(BLSFieldElement(current_power))
current_power = current_power * int(x) % BLS_MODULUS
return powers
def vector_lincomb(vectors: List[List[BLSFieldElement]], scalars: List[BLSFieldElement]) > List[BLSFieldElement]:
"""
Given a list of vectors, compute the linear combination of each column with `scalars`, and return the resulting
vector.
"""
r = [0]*len(vectors[0])
for v, a in zip(vectors, scalars):
for i, x in enumerate(v):
r[i] = (r[i] + a * x) % BLS_MODULUS
return [BLSFieldElement(x) for x in r]
def validate_blob_transaction_wrapper(wrapper: BlobTransactionNetworkWrapper):
versioned_hashes = wrapper.tx.message.blob_versioned_hashes
commitments = wrapper.blob_kzgs
blobs = wrapper.blobs
# note: assert blobs are not malformatted
assert len(versioned_hashes) == len(commitments) == len(blobs)
number_of_blobs = len(blobs)
# Generate random linear combination challenges
r = hash_to_bls_field([blobs, commitments])
r_powers = compute_powers(r, number_of_blobs)
# Compute commitment to aggregated polynomial
aggregated_poly_commitment = lincomb(commitments, r_powers)
# Create aggregated polynomial in evaluation form
aggregated_poly = vector_lincomb(blobs, r_powers)
# Generate challenge `x` and evaluate the aggregated polynomial at `x`
x = hash_to_bls_field([aggregated_poly, aggregated_poly_commitment])
y = evaluate_polynomial_in_evaluation_form(aggregated_poly, x)
# Verify aggregated proof
assert verify_kzg_proof(aggregated_poly_commitment, x, y, wrapper.kzg_aggregated_proof)
# Now that all commitments have been verified, check that versioned_hashes matches the commitments
for versioned_hash, commitment in zip(versioned_hashes, commitments):
assert versioned_hash == kzg_to_versioned_hash(commitment)
Rationale
On the path to sharding
This EIP introduces blob transactions in the same format in which they are expected to exist in the final sharding specification. This provides a temporary but significant scaling relief for rollups by allowing them to scale to 2 MB per slot, with a separate fee market allowing fees to be very low while usage of this system is limited.
The core goal of rollup scaling stopgaps is to provide temporary scaling relief, without imposing extra development burdens on rollups to take advantage of this relief. Today, rollups use calldata. In the future, rollups will have no choice but to use sharded data (also called “blobs”) because sharded data will be much cheaper. Hence, rollups cannot avoid making a large upgrade to how they process data at least once along the way. But what we can do is ensure that rollups need to only upgrade once. This immediately implies that there are exactly two possibilities for a stopgap: (i) reducing the gas costs of existing calldata, and (ii) bringing forward the format that will be used for sharded data, but not yet actually sharding it. Previous EIPs were all a solution of category (i); this EIP is a solution of category (ii).
The main tradeoff in designing this EIP is that of implementing more now versus having to implement more later: do we implement 25% of the work on the way to full sharding, or 50%, or 75%?
The work that is already done in this EIP includes:
 A new transaction type, of the exact same format that will need to exist in “full sharding”
 All of the executionlayer logic required for full sharding
 All of the execution / consensus crossverification logic required for full sharding
 Layer separation between
BeaconBlock
verification and data availability sampling blobs  Most of the
BeaconBlock
logic required for full sharding  A selfadjusting independent gasprice for blobs.
The work that remains to be done to get to full sharding includes:
 A lowdegree extension of the
blob_kzgs
in the consensus layer to allow 2D sampling  An actual implementation of data availability sampling
 PBS (proposer/builder separation), to avoid requiring individual validators to process 32 MB of data in one slot
 Proof of custody or similar inprotocol requirement for each validator to verify a particular part of the sharded data in each block
This EIP also sets the stage for longerterm protocol cleanups:
 It adds an SSZ transaction type which is slightly gasadvantaged (1000 discount) to nudge people toward using it, and paves the precedent that all new transaction types should be SSZ
 It defines
TransactionNetworkPayload
to separate network and block encodings of a transaction type  Its (cleaner) gas price update rule could be applied to the primary basefee.
How rollups would function
Instead of putting rollup block data in transaction calldata, rollups would expect rollup block submitters to put the data into blobs. This guarantees availability (which is what rollups need) but would be much cheaper than calldata. Rollups need data to be available once, long enough to ensure honest actors can construct the rollup state, but not forever.
Optimistic rollups only need to actually provide the underlying data when fraud proofs are being submitted. The fraud proof can verify the transition in smaller steps, loading at most a few values of the blob at a time through calldata. For each value it would provide a KZG proof and use the point evaluation precompile to verify the value against the versioned hash that was submitted before, and then perform the fraud proof verification on that data as is done today.
ZK rollups would provide two commitments to their transaction or state delta data: the kzg in the blob and some commitment using whatever proof system the ZK rollup uses internally. They would use a commitment proof of equivalence protocol, using the point evaluation precompile, to prove that the kzg (which the protocol ensures points to available data) and the ZK rollup’s own commitment refer to the same data.
Versioned hashes
We use versioned hashes (rather than kzgs) as references to blobs in the execution layer to ensure forward compatibility with future changes. For example, if we need to switch to Merkle trees + STARKs for quantumsafety reasons, then we would add a new version, allowing the point verification precompile to work with the new format. Rollups would not have to make any EVMlevel changes to how they work; sequencers would simply have to switch over to using a new transaction type at the appropriate time.
Blob gasprice update rule
The blob gasprice update rule is intended to approximate the formula blob_gas = 2**(excess_blobs / GASPRICE_UPDATE_FRACTION_PER_BLOB)
,
where excess_blobs
is the total “extra” number of blobs that the chain has accepted relative to the “targeted” number (TARGET_BLOBS_PER_BLOCK
per block).
Like EIP1559, it’s a selfcorrecting formula: as the excess goes higher, the blob_gas
increases exponentially, reducing usage and eventually forcing the excess back down.
The blockbyblock behavior is roughly as follows.
If in block N
, blob_gas = G1
, and block N
has X
blobs, then in block N+1
, excess_blobs
increases by X  TARGET_BLOBS_PER_BLOCK
,
and so the blob_gas
of block N+1
increases by a factor of 2**((X  TARGET_BLOBS_PER_BLOCK) / GASPRICE_UPDATE_FRACTION_PER_BLOB)
.
Hence, it has a similar effect to the existing EIP1559, but is more “stable” in the sense that it responds in the same way to the same total usage regardless of how it’s distributed.
Backwards Compatibility
Blob nonaccessibility
This EIP introduces a transaction type that has a distinct mempool version (BlobTransactionNetworkWrapper
) and executionpayload version (SignedBlobTransaction
),
with only oneway convertibility between the two. The blobs are in the BlobTransactionNetworkWrapper
and not in the SignedBlobTransaction
;
instead, they go into the BeaconBlockBody
. This means that there is now a part of a transaction that will not be accessible from the web3 API.
Mempool issues
Blob transactions are unique in that they have a variable intrinsic gas cost. Hence, a transaction that could be included in one block may be invalid for the next.
To prevent mempool attacks, we recommend a simple technique: only propagate transactions whose gas
is at least twice the current minimum.
Additionally, blob transactions have a large data size at the mempool layer, which poses a mempool DoS risk,
though not an unprecedented one as this also applies to transactions with large amounts of calldata.
The risk is that an attacker makes and publishes a series of large blob transactions with fees f9 > f8 > ... > f1
,
where each fee is the 10% minimum increment higher than the previous, and finishes it off with a 21000gas basic transaction with fee f10
.
Hence, an attacker could impose millions of gas worth of load on the network and only pay 21000 gas worth of fees.
We recommend a simple solution: both for blob transactions and for transactions carrying a large amount of calldata, increase the minimum increment for mempool replacement from 1.1x to 2x, decreasing the number of resubmissions an attacker can do at any given fee level by ~7x.
Test Cases
TBD
Security Considerations
This EIP increases the storage requirements per Beacon block by a maximum of ~2 MB. This is equal to the theoretical maximum size of a block today (30M gas / 16 gas per calldata byte = 1.875M bytes), and so it will not greatly increase worstcase bandwidth. Postmerge, block times are expected to be static rather than an unpredictable Poisson distribution, giving a guaranteed period of time for large blocks to propagate.
The sustained load of this EIP is much lower than alternatives that reduce calldata costs, even if the calldata is limited, because there is no existing software that stores the blobs indefinitely and there is no expectation that they need to be stored for as long as an execution payload. This makes it easier to implement a policy that these blobs should be deleted after e.g. 3060 days, a much shorter delay compared to proposed (but yet to be implemented) oneyear rotation times for execution payload history.
Copyright
Copyright and related rights waived via CC0.
Citation
Please cite this document as:
Vitalik Buterin, Dankrad Feist, Diederik Loerakker, George Kadianakis, Matt Garnett, Mofi Taiwo, "EIP4844: Shard Blob Transactions [DRAFT]," Ethereum Improvement Proposals, no. 4844, February 2022. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip4844.