Introduce account versioning for smart contracts so upgrading the VM
or introducing new VMs can be easier.
Abstract
This defines a method of hard forking while maintaining the exact
functionality of existing account by allowing multiple versions of the
virtual machines to execute in the same block. This is also useful to
define future account state structures when we introduce the on-chain
WebAssembly virtual machine.
Motivation
By allowing account versioning, we can execute different virtual
machine for contracts created at different times. This allows breaking
features to be implemented while making sure existing contracts work
as expected.
Note that this specification might not apply to all hard forks. We
have emergency hard forks in the past due to network attacks. Whether
they should maintain existing account compatibility should be
evaluated in individual basis. If the attack can only be executed once
against some particular contracts, then the scheme defined here might
still be applicable. Otherwise, having a plain emergency hard fork
might still be a good idea.
Specification
Account State
Re-define account state stored in the world state trie to have 5
items: nonce, balance, storageRoot, codeHash, and
version. The newly added field version is a 256-bit scalar. We
use the definition of “scalar” from Yellow Paper. Note that this is
the same type as nonce and balance, and it is equivalent to a RLP
variable-sized byte array with no leading zero, of maximum length 32.
When version is zero, the account is RLP-encoded with the first 4
items. When version is not zero, the account is RLP-encoded with 5
items.
Account versions can also optionally define additional account state
RLP fields, whose meaning are specified through its version
field. In those cases, the parsing strategy is defined in “Additional
Fields in Account State RLP” section.
Contract Execution
When fetching an account code from state, we always fetch the
associated version field together. We refer to this as the code’s
version below. The code of the account is always executed in the
code’s version.
In particular, this means that for DELEGATECALL and CALLCODE, the
version of the execution call frame is the same as
delegating/receiving contract’s version.
Contract Deployment
In Ethereum, a contract has a deployment method, either by a contract
creation transaction, or by another contract. If we regard this
deployment method a contract’s parent, then we find them forming a
family of contracts, with the root being a contract creation
transaction.
We let a family of contracts to always have the same version. That
is, CREATE and CREATE2 will always deploy contract that has the
same version as the code’s version.
In other words, CREATE and CREATE2 will execute the init code
using the current code’s version, and deploy the contract of the
current code’s version. This holds even if the to-be-deployed code
is empty.
Validation
A new phrase, validation is added to contract deployment (by
CREATE / CREATE2 opcodes, or by contract creation
transaction). When version is 0, the phrase does nothing and
always succeeds. Future VM versions can define additional validation
that has to be passed.
If the validation phrase fails, deployment does not proceed and return
out-of-gas.
Contract Creation Transaction
Define LATEST_VERSION in a hard fork to be the latest supported VM
version. A contract creation transaction is always executed in
LATEST_VERSION (which means the code’s version is
LATEST_VERSION), and deploys contracts of LATEST_VERSION. Before a
contract creation transaction is executed, run validation on the
contract creation code. If it does not pass, return out-of-gas.
Precompiled Contract and Externally-owned Address
Precompiled contracts and externally-owned addresses do not have
version. If a message-call transaction or CALL / CALLCODE /
STATICCALL / DELEGATECALL touches a new externally-owned address
or a non-existing precompiled contract address, it is always created
with version field being 0.
Additional Fields in Account State RLP
In the future we may need to associate more information into an
account, and we already have some EIPs that define new additional
fields in the account state RLP. In this section, we define the
parsing strategy when additional fields are added.
Check the RLP list length, if it is 4, then set account version to
0, and do not parse any additional fields.
If the RLP list length more than 4, set the account version to the
scalar at position 4 (counting from 0).
Check version specification for the number of additional fields
defined N, if the RLP list length is not equal to 5 + N,
return parse error.
Parse RLP position 5 to 4 + N as the meaning specified in
additional fields.
Extensions
In relation to the above “Specification” section, we have defined the
base account versioning layer. The base account versioning layer is
already useful by itself and can handle most EVM improvements. Below
we define two specifications that can be deployed separately, which
improves functionality of base layer account versioning.
Note that this section is provided only for documentation
purpose. When “enabling EIP-1702”, those extensions should not be
enabled unless the extension specification is also included.
This section defines how other EIPs might use this account
versioning specification. Note that currently we only define the usage
template for base layer.
Account versioning is usually applied directly to a hard fork
meta. EIPs in the hard fork are grouped by the virtual machine
type, for example, EVM and eWASM. For each of them, we define:
Version: a non-zero scalar less than 2^256 that uniquely
identifies this version. Note that it does not need to be
sequential.
Parent version: the base that all new features derived
from. With parent version of 0 we define the base to be legacy
VM. Note that once a version other than 0 is defined, the legacy
VM’s feature set must be frozen. When defining an entirely new VM
(such as eWASM), parent version does not apply.
Features: all additional features that are enabled upon this
version.
If a meta EIP includes EIPs that provide additional account state RLP
fields, we also define:
Account fields: all account fields up to the end of this meta
EIP, excluding the basic 5 fields (nonce, balance,
storageRoot, codeHash and version). If EIPs included that are
specific to modifying account fields do not modify VM execution
logic, it is recommended that we specify an additional version whose
execution logic is the same as previous version, but only the
account fields are changed.
Rationale
This introduces account versioning via a new RLP item in account
state. The design above gets account versioning by making the contract
family always have the same version. In this way, versions are only
needed to be provided by contract creation transaction, and there is
no restrictions on formats of code for any version. If we want to
support multiple newest VMs (for example, EVM and WebAssembly running
together), then this will requires extensions such as 44-VERTXN and
45-VEROP.
Alternatively, account versioning can also be done through:
26-VER and
40-UNUSED: This makes an
account’s versioning soly dependent on its code header prefix. If
with only 26-VER, it is not possible to certify any code is valid,
because current VM allows treating code as data. This can be fixed
by 40-UNUSED, but the drawback is that it’s potentially backward
incompatible.
EIP-1891: Instead of writing version field into account RLP
state, we write it in a separate contract. This can accomplish the
same thing as this EIP and potentially reduces code complexity, but
the drawback is that every code execution will require an additional
trie traversal, which impacts performance.
Backwards Compatibility
Account versioning is fully backwards compatible, and it does not
change how current contracts are executed.
Discussions
Performance
Currently nearly all full node implementations uses config parameters
to decide which virtual machine version to use. Switching virtual
machine version is simply an operation that changes a pointer using a
different set of config parameters. As a result, this scheme has
nearly zero impact to performance.
WebAssembly
This scheme can also be helpful when we deploy on-chain WebAssembly
virtual machine. In that case, WASM contracts and EVM contracts can
co-exist and the execution boundary and interaction model are clearly
defined as above.
Test Cases and Implementations
To be added.
References
The source of this specification can be found at
43-VER.