Alert Source Discuss
⚠️ Review Standards Track: Core

EIP-7620: EOF Contract Creation

Introduce `EOFCREATE`, `TXCREATE`, `RETURNCONTRACT` instructions along with a new `InitcodeTransaction` transaction

Authors Alex Beregszaszi (@axic), Paweł Bylica (@chfast), Andrei Maiboroda (@gumb0), Piotr Dobaczewski (@pdobacz)
Created 2024-02-12
Requires EIP-170, EIP-1559, EIP-2028, EIP-2718, EIP-3540, EIP-3541, EIP-3670, EIP-3860

Abstract

EVM Object Format (EOF) removes the possibility to create contracts using creation transactions (with an empty to field), CREATE or CREATE2 instructions. We introduce three new instructions: EOFCREATE, TXCREATE and RETURNCONTRACT, as well as a new transaction type (InitcodeTransaction) to provide a way to create contracts using EOF containers.

Motivation

This EIP uses terminology from the EIP-3540 which introduces the EOF format.

EOF aims to remove code observability, which is a prerequisite to legacy EVM contract creation logic using create transactions, CREATE or CREATE2, because both the initcode and code are available to the EVM and can be manipulated. On the same premise, EOF removes opcodes like CODECOPY and EXTCODECOPY, introducing EOF subcontainers as a replacement to cater for factory contracts creating other contracts.

The new instructions and the new transaction type introduced in this EIP operate on EOF containers enabling all use cases of contract creation that legacy EVM has.

Specification

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Parameters

Constant Value
INITCODE_TX_TYPE Bytes1(0x04)
MAX_INITCODE_COUNT 256
GAS_KECCAK256_WORD Defined as 6 in the Ethereum Execution Layer Specs
TX_CREATE_COST Defined as 32000 in the Ethereum Execution Layer Specs
STACK_DEPTH_LIMIT Defined as 1024 in the Ethereum Execution Layer Specs
GAS_CODE_DEPOSIT Defined as 200 in the Ethereum Execution Layer Specs
TX_DATA_COST_PER_ZERO Defined as 4 in the Ethereum Execution Layer Specs
TX_DATA_COST_PER_NON_ZERO Defined as 16 in the Ethereum Execution Layer Specs
INITCODE_WORD_COST Defined as 2 in EIP-3860
MAX_INITCODE_SIZE Defined as 2 * MAX_CODE_SIZE in EIP-3860
MAX_CODE_SIZE Defined as 24576 in EIP-170

We introduce three new instructions on the same block number EIP-3540 is activated on:

  1. EOFCREATE (0xec)
  2. TXCREATE (0xed)
  3. RETURNCONTRACT (0xee)

Transaction Types

Introduce new transaction InitcodeTransaction (type INITCODE_TX_TYPE) which extends EIP-1559 (type 2) transaction by adding a new field initcodes: List[ByteList[MAX_INITCODE_SIZE], MAX_INITCODE_COUNT].

The initcodes can only be accessed via the TXCREATE instruction (see below), therefore InitcodeTransactions are intended to be sent to contracts including TXCREATE in their execution.

We introduce a standardised Creator Contract (i.e. written in EVM, but existing at a known address, such as precompiles), which eliminates the need to have create transactions with empty to. Deployment of the Creator Contract will require an irregular state change at EOF activation block. Note that such introduction of the Creator Contract is needed, because only EOF contracts can create EOF contracts. See below for Creator Contract code.

Gas schedule

Each initcodes item data costs the same as calldata (TX_DATA_COST_PER_NON_ZERO gas for non-zero bytes, TX_DATA_COST_PER_ZERO for zero bytes – see EIP-2028). The intrinsic gas of an InitcodeTransaction is extended by the sum of all those items’ costs. Using the conventions from calculate_intrinsic_cost in Ethereum Execution Layer Specs, the additional cost is calculated as:

initcode_cost = 0
    for initcode in tx.initcodes:
        for byte in initcode:
            if byte == 0:
                initcode_cost += TX_DATA_COST_PER_ZERO
            else:
                initcode_cost += TX_DATA_COST_PER_NON_ZERO

Transaction validation

  • InitcodeTransaction is invalid if there are zero entries in initcodes, or if there are more than MAX_INITCODE_COUNT entries.
  • InitcodeTransaction is invalid if any entry in initcodes is zero length, or if any entry exceeds MAX_INITCODE_SIZE.
  • InitcodeTransaction is invalid if the to is nil.

Under transaction validation rules initcodes are not validated for conforming to the EOF specification. They are only validated when accessed via TXCREATE. This avoids potential DoS attacks of the mempool. If during the execution of an InitcodeTransaction no TXCREATE instruction is called, such transaction is still valid.

RLP and signature

Given the definitions from EIP-2718 the TransactionPayload for an InitcodeTransaction is the RLP serialization of:

[chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas, gas_limit, to, value, data, access_list, initcodes, y_parity, r, s]

TransactionType is INITCODE_TX_TYPE and the signature values y_parity, r, and s are calculated by constructing a secp256k1 signature over the following digest:

keccak256(INITCODE_TX_TYPE || rlp([chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas, gas_limit, to, value, data, access_list, initcodes]))

The EIP-2718 ReceiptPayload for this transaction is rlp([status, cumulative_transaction_gas_used, logs_bloom, logs]).

Execution Semantics

  • The instructions CREATE, CREATE2 are made obsolete and rejected by validation in EOF contracts. They are only available in legacy contracts.
  • Legacy creation transactions (any transactions with empty to) are invalid in case data contains EOF code (starts with EF00 magic)
  • If instructions CREATE and CREATE2 have EOF code as initcode (starting with EF00 magic)
    • deployment fails (returns 0 on the stack)
    • caller’s nonce is not updated and gas for initcode execution is not consumed

Overview of the new contract creation flow

In EOF EVM, new bytecode is delivered inside a special field in an InitcodeTransaction in the form of EOF containers. Such EOF containers may contain arbitrarily deeply nesting subcontainers. A target contract of an InitcodeTransaction may execute TXCREATE instruction(s), and each execution refers to one such EOF container (the initcontainer). The initcontainer and its subcontainers are recursively validated according to all the validation rules applicable for the EOF version in question. Next, the 0th code section of the initcontainer is executed and may eventually call a RETURNCONTRACT instruction, which will refer to a subcontainer to be finally deployed to an address.

As such, InitcodeTransaction and TXCREATE are an EOF replacement of a legacy create transaction.

EOFCREATE instruction is in turn a replacement of the CREATE and CREATE2 legacy instructions allowing factory contracts to create other contracts. The main difference to TXCREATE is that the initcontainer is selected to be one of the subcontainers of the EOF container calling EOFCREATE. It is worth noting that no validation is performed at this point, as it has already been done when the factory contract containing EOFCREATE was deployed.

Details on each instruction follow in the next sections.

EOFCREATE

  • deduct TX_CREATE_COST gas
  • read immediate operand initcontainer_index, encoded as 8-bit unsigned value
  • pop value, salt, input_offset, input_size from the operand stack
  • peform (and charge for) memory expansion using [input_offset, input_size]
  • load initcode EOF subcontainer at initcontainer_index in the container from which EOFCREATE is executed
    • let initcontainer_size be the declared size of that EOF subcontainer in its parent container header
  • deduct GAS_KECCAK256_WORD * ((initcontainer_size + 31) // 32) gas (hashing charge)
  • check that current call depth is below STACK_DEPTH_LIMIT and that caller balance is enough to transfer value
    • in case of failure return 0 on the stack, caller’s nonce is not updated and gas for initcode execution is not consumed.
  • follow steps in the Initcontainer execution below
  • deduct GAS_CODE_DEPOSIT * deployed_code_size gas

TXCREATE

  • deduct TX_CREATE_COST gas
  • pop tx_initcode_hash, value, salt, input_offset, input_size from the operand stack
  • peform (and charge for) memory expansion using [input_offset, input_size]
  • load initcode EOF container from the transaction initcodes array which hashes to tx_initcode_hash
    • fails (returns 0 on the stack) if such initcode does not exist in the transaction, or if called from a transaction of TransactionType other than INITCODE_TX_TYPE
      • caller’s nonce is not updated and gas for initcode execution is not consumed. Only TXCREATE constant gas was consumed
    • let initcontainer_size be the length of that EOF container in bytes
  • deduct INITCODE_WORD_COST * ((initcontainer_size + 31) // 32) gas
  • deduct GAS_KECCAK256_WORD * ((initcontainer_size + 31) // 32) gas (hashing charge)
  • check that current call depth is below STACK_DEPTH_LIMIT and that caller balance is enough to transfer value
  • validate the initcode container and all its subcontainers recursively
  • in addition to this, check if the initcode container has its len(data_section) equal data_size, i.e. data section content is exactly as the size declared in the header (see Data section lifecycle)
  • fails (returns 0 on the stack) if any of the checks above was invalid
    • caller’s nonce is not updated and gas for initcode execution is not consumed. Only TX_CREATE_COST constant, EIP-3860 gas and hashing gas were consumed
  • follow steps in the Initcontainer execution below
  • deduct GAS_CODE_DEPOSIT * deployed_code_size gas

Initcontainer execution

These steps are common for EOFCREATE and TXCREATE:

  • caller’s memory slice [input_offset:input_size] is used as calldata
  • execute the container in “initcode-mode” and deduct gas for execution. The 63/64th rule from EIP-150 applies.
  • increment sender account’s nonce
  • calculate new_address as keccak256(0xff || sender || salt || keccak256(initcontainer))[12:]
  • an unsuccesful execution of initcode results in pushing 0 onto the stack
    • can populate returndata if execution REVERTed
  • a successful execution ends with initcode executing RETURNCONTRACT{deploy_container_index}(aux_data_offset, aux_data_size) instruction (see below). After that:
    • load deploy EOF subcontainer at deploy_container_index in the container from which RETURNCONTRACT is executed
    • concatenate data section with (aux_data_offset, aux_data_offset + aux_data_size) memory segment and update data size in the header
    • if updated deploy container size exceeds MAX_CODE_SIZE instruction exceptionally aborts
    • set state[new_address].code to the updated deploy container
    • push new_address onto the stack
  • RETURN and STOP are not allowed in “initcode-mode” (abort execution)
  • “initcode-mode” is not transitive - it is only active for the frame executing the initcontainer. If another (non-create) call is made from this frame, it is not executed in “initcode-mode”.

RETURNCONTRACT

  • read immediate operand deploy_container_index, encoded as 8-bit unsigned value
  • pop two values from the operand stack: aux_data_offset, aux_data_size referring to memory section that will be appended to deployed container’s data
  • cost 0 gas + possible memory expansion for aux data
  • ends initcode frame execution and returns control to EOFCREATE/4 caller frame where deploy_container_index and aux_data are used to construct deployed contract (see above)
  • instruction exceptionally aborts if after the appending, data section size would overflow the maximum data section size or underflow (i.e. be less than data section size declared in the header)
  • instruction exceptionally aborts if invoked not in “initcode-mode”

Code Validation

We extend code section validation rules (as defined in EIP-3670).

  1. EOFCREATE initcontainer_index must be less than num_container_sections
  2. EOFCREATE the subcontainer pointed to by initcontainer_index must have its len(data_section) equal data_size, i.e. data section content is exactly as the size declared in the header (see Data section lifecycle)
  3. RETURNCONTRACT deploy_container_index must be less than num_container_sections
  4. RJUMP, RJUMPI and RJUMPV immediate argument value (jump destination relative offset) validation: code section is invalid in case offset points to the byte directly following either EOFCREATE or RETURNCONTRACT instruction.

Data Section Lifecycle

For an EOF container which has not yet been deployed, the data_section is only a portion of the final data_section after deployment. Let’s define it as pre_deploy_data_section and as pre_deploy_data_size the data_size declared in that container’s header. pre_deploy_data_size >= len(pre_deploy_data_section), which anticipates more data to be appended to the pre_deploy_data_section during the process of deploying.

pre_deploy_data_section
|                                      |
\___________pre_deploy_data_size______/

For a deployed EOF container, the final data_section becomes:

pre_deploy_data_section | static_aux_data | dynamic_aux_data
|                         |             |                  |
|                          \___________aux_data___________/
|                                       |                  |
\___________pre_deploy_data_size______/                    |
|                                                          |
\________________________data_size_______________________/

where:

  • aux_data is the data which is appended to pre_deploy_data_section on RETURNCONTRACT instruction see Initcontainer execution.
  • static_aux_data is a subrange of aux_data, which size is known before RETURNCONTRACT and equals pre_deploy_data_size - len(pre_deploy_data_section).
  • dynamic_aux_data is the remainder of aux_data.

data_size in the deployed container header is updated to be equal len(data_section).

Summarizing, there are pre_deploy_data_size bytes in the final data section which are guaranteed to exist before the EOF container is deployed and len(dynamic_aux_data) bytes which are known to exist only after. This impacts the validation and behavior of data-section-accessing instructions: DATALOAD, DATALOADN, and DATACOPY, see EIP-7480.

Creator Contract

{
/// Takes [index][salt][init_data] as input,
/// creates contract and returns the address or failure otherwise

/// init_data.length can be 0, but the first 2 words are mandatory
let size := calldatasize()
if lt(size, 64) { revert(0, 0) }

let tx_initcode_index := calldataload(0)
let salt := calldataload(32)

let init_data_size := sub(size, 64)
calldatacopy(0, 64, init_data_size)

let ret := txcreate(tx_initcode_index, callvalue(), salt, 0, init_data_size)
if iszero(ret) { revert(0, 0) }

mstore(0, ret)
return(0, 32)

// Helper to compile this with existing Solidity (with --strict-assembly mode)
function txcreate(a, b, c, d, e) -> f {
    f := verbatim_5i_1o(hex"ed", a, b, c, d, e)
}
    
}

Rationale

Data section appending

The data section is appended to during contract creation and also its size needs to be updated in the header. Alternative designs were considered, where:

  • additional section kinds for the data were introduced
  • additional fields describing a subcontainer were introduced
  • data section would be written over as opposed to being appended to, requiring it to be filled with 0 bytes prior to deployment

All of these alternatives either complicated the otherwise simple data structures or took away useful features (like the dynamically sized portion of the data section).

TXCREATE

An alternative design was discussed, where instead of TXCREATE, InitcodeTransaction and the Creator Contract, the legacy-like creation transactions (with empty to) continue to be available to send EOF initcontainers in transactions and perform deployment of EOF contracts.

This makes it impossible for smart contract wallets to deploy arbitrary contracts (only EOAs can), which is a desired feature of current legacy creation rules. A workaround where those arbitrary contracts are first “uploaded” to a factory contract, and then deployed using EOFCREATE, is not satisfactory.

TXCREATE failure modes

TXCREATE has two “light” failure modes in case the initcontainer is not present and in case the EOF validation is unsuccessful. An alternative design where both cases led to a “hard” failure (consuming the entire gas available) was considered. We decided to have the more granular and forgiving failure modes in order to align the gas costs incurred to the actual work the EVM performs.

Creator Contract

EOF contract creation requires the Creator Contract be introduced via a state change, because neither legacy contracts nor create transactions can deploy EOF code. The alternative approach which was to continue using legacy creation would still rely on fetching the initcode from memory and not satisfy the requirement of code non-observability.

EOF validation checking initcode-mode

Extra EOF validation rules were considered:

  • Each EOF subcontainer must either be referenced by an EOFCREATE or a RETURNCONTRACT instruction, but never both
  • Each EOF subcontainer may either contain RETURNCONTRACT or RETURN / STOP instructions, but never a mix of these two kinds
  • A subcontainer referenced by an EOFCREATE must never contain a RETURN / STOP instruction, whereas a subcontainer referenced by a RETURNCONTRACT must never contain a RETURNCONTRACT

These rules were not adopted to avoid the extra logic and complexity (esp. the last rule causes EOF validation of subcontainers to not be self-contained).

Backwards Compatibility

This change poses no risk to backwards compatibility, as it is introduced at the same time EIP-3540 is. The new instructions are not introduced for legacy bytecode (code which is not EOF formatted), and the contract creation options do not change for legacy bytecode.

Legacy create transactions with data starting with EF00 now are invalid. Similarly CREATE and CREATE2 calls with EF00 initcode fail early without executing the initcode. Previously, in both cases the initcode execution would begin and fail on the first undefined instruction EF.

Test Cases

Creation transaction, CREATE and CREATE2 cannot have its code starting with 0xEF, but such cases are covered already in EIP-3541. However, new cases must be added where creation transaction, CREATE or CREATE2 have its initcode being (validly or invalidly) EOF formatted:

Initcode Expected result
0xEF initcode starts execution and fails
0xEF01 initcode starts execution and fails
0xEF5f initcode starts execution and fails
0xEF00 invalid creation transaction (or CREATE / CREATE2 fails early, returns 0 and keeps sender nonce intact)
0xEF0001 as above
valid EOFv1 container as above

Since EOF contract validation happens for all EOF containers during TXCREATE, the following cases must be tested:

  • TXCREATE references a valid EOF initcontainer having only valid subcontainers
  • TXCREATE references legacy code, contract creation fails
  • TXCREATE references an invalid EOF initcontainer, contract creation fails
  • TXCREATE references a valid EOF initcontainer having an invalid EOF subcontainer somewhere in the subcontainer tree, contract creation fails
  • TXCREATE references a valid EOF initcontainer having legacy code somewhere in the subcontainer tree, contract creation fails

Cases for initcode calling “nested” EOFCREATE or TXCREATE in various combinations

Security Considerations

TXCREATE needs a detailed review and discussion as that is where external unverified code enters the state. Among others:

  1. Is its complexity under control, ruling out any DoS attempts
  2. Is it correctly priced and always charged for
  3. Is the validation comprehensive and not allowing problematic code to be saved into the state

Copyright and related rights waived via CC0.

Citation

Please cite this document as:

Alex Beregszaszi (@axic), Paweł Bylica (@chfast), Andrei Maiboroda (@gumb0), Piotr Dobaczewski (@pdobacz), "EIP-7620: EOF Contract Creation [DRAFT]," Ethereum Improvement Proposals, no. 7620, February 2024. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-7620.