⚠️ Draft Standards Track: Core

EIP-7834: Separate Metadata Section for EOF

Introduce a new separate metadata section to the EOF

Authors Kaan Uzdogan (@kuzdogan), Marco Castignoli (@marcocastignoli), Manuel Wedler (@manuelwedler)
Created 2024-12-06
Discussion Link https://ethereum-magicians.org/t/eip-7834-separate-metadata-section-for-eof/22138
Requires EIP-3540

Abstract

Introduce a new, separate metadata section to the Ethereum Object Format (EOF) that is unreachable by the code, so that changes to the metadata do not affect the code.

Motivation

It is desirable to include metadata in a contract’s bytecode for various reasons. For instance, both the Solidity and Vyper compilers by default include the language and compiler version used to compile. Vyper (with 0.4.1) appends an integrity hash to the initcode in CBOR encoding. Solidity additionally includes the IPFS or the Swarm hash of the Solidity contract metadata.json file, and the experimental Solidity flag. The current (pre-EOF) practice is to append this CBOR-encoded metadata section to the contract’s runtime bytecode, followed by the 2-byte length of the CBOR-encoded bytes.

        Solidity     ┌──────────────────────────────────────────0x0033 bytes──────────────────────────────────────────────┐
...7265206c656e677468a2646970667358221220dceca8706b29e917dacf25fceef95acac8d90d765ac926663ce4096195952b6164736f6c634300060b0033

This poses a problem for source code verification, where the onchain bytecode is compared to the bytecode compiled from the given source code. During contract verification, metadata sections, in particular the IPFS hash, need to be ignored and only the executional bytecode should be compared. Since pre-EOF bytecode is not structured, it is not possible to easily distinguish the metadata section from the executional bytecode. This gets even trickier in the case of factory contracts with multiple nested bytecodes, each having its own metadata section. Verifiers need to implement their own heuristics and workarounds to find the metadata sections and ignore them.
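The trailing-length convention described above can be sketched as follows. This is a minimal illustration, not a verifier implementation: the function name `split_trailing_metadata` is hypothetical, and the code assumes that the last two bytes really are a metadata length, which is exactly the kind of heuristic that unstructured pre-EOF bytecode forces on tooling.

```python
def split_trailing_metadata(runtime: bytes):
    """Split pre-EOF runtime bytecode into (code, cbor_metadata).

    Assumption: the final two bytes are the big-endian length of the
    CBOR payload that immediately precedes them.
    """
    if len(runtime) < 2:
        raise ValueError("bytecode too short to carry a length suffix")
    cbor_len = int.from_bytes(runtime[-2:], "big")
    if cbor_len + 2 > len(runtime):
        raise ValueError("length suffix exceeds bytecode size")
    cbor_start = len(runtime) - 2 - cbor_len
    return runtime[:cbor_start], runtime[cbor_start:-2]


# Tail of the Solidity example shown earlier (the leading "..." is omitted).
tail = bytes.fromhex(
    "7265206c656e677468"
    "a2646970667358221220"
    "dceca8706b29e917dacf25fceef95acac8d90d765ac926663ce4096195952b61"
    "64736f6c634300060b"
    "0033"
)
code, cbor = split_trailing_metadata(tail)
print(len(cbor))  # 51, i.e. 0x33
print(code)       # b're length'
```

Nothing in the bytecode itself marks these bytes as metadata, so a contract compiled without metadata would be split at an arbitrary position by the same function.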

The EOF brings structure to the bytecode by separating the code from the data, and placing the code of each contract in its own container. In its current form, this makes the data easier to find than in pre-EOF bytecode. However, the current spec does not describe a metadata section either. Compilers currently need to place the contract metadata inside the data section, which poses several problems:

  1. It is not straightforward to distinguish the metadata part in the data_section, which poses the same problem as the pre-EOF bytecode.
  2. Any change to the metadata’s size within the data section will change the executional bytecode, e.g. through shifting DATALOADN offsets. With that, two identical contracts with different metadata sizes will not match during source code verification, since the code will be different.
  3. The metadata can theoretically be reached by the code, e.g. by manipulating the DATALOADN instructions.

Specification

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Extending the format introduced in EIP-3540, this EIP proposes to add a new OPTIONAL section in the body called metadata_section before the data_section, and to add two new OPTIONAL fields kind_metadata (value: 0x05) and metadata_size to the header before the kind_data and data_size fields.

container := header, body
header :=
    magic, version,
    kind_type, type_size,
    kind_code, num_code_sections, code_size+,
    [kind_container, num_container_sections, container_size+,]
    [kind_metadata, metadata_size,]
    kind_data, data_size,
    terminator
body := types_section, code_section+, container_section*, [metadata_section], data_section
types_section := (inputs, outputs, max_stack_height)+

Header

name          | length  | value         | description
--------------|---------|---------------|------------
kind_metadata | 1 byte  | 0x05          | kind marker for metadata size section
metadata_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the metadata section content
kind_data     | 1 byte  | 0x04          | kind marker for data size section
data_size     | 2 bytes | 0x0000-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the data section content (*)
terminator    | 1 byte  | 0x00          | marks the end of the header

Body

name             | length   | value | description
-----------------|----------|-------|------------
metadata_section | variable | n/a   | arbitrary sequence of bytes
data_section     | variable | n/a   | arbitrary sequence of bytes

The structure and the encoding of the metadata_section are not defined by this EIP. It is left to the compilers, tooling, or the contract developers to define the encoding and the content. The current practice by the Solidity and Vyper compilers is to use CBOR encoding.
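To illustrate how tooling might locate the proposed section, here is a minimal Python sketch of a header parser following the grammar above. It is not a validating EOF implementation. The magic bytes (0xEF00), the version (0x01), the kind values 0x01 to 0x03 and the 2-byte width of every size field are assumptions carried over from EIP-3540 and are not stated in this document; the names `Header`, `parse_header` and `split_sections` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAGIC = b"\xef\x00"  # assumed from EIP-3540
KIND_TYPE, KIND_CODE, KIND_CONTAINER, KIND_DATA = 0x01, 0x02, 0x03, 0x04
KIND_METADATA = 0x05  # proposed by this EIP


@dataclass
class Header:
    type_size: int
    code_sizes: List[int]
    container_sizes: List[int]
    metadata_size: Optional[int]  # None when the optional fields are absent
    data_size: int
    length: int  # total header length in bytes, terminator included


def parse_header(raw: bytes) -> Header:
    if raw[:2] != MAGIC or raw[2:3] != b"\x01":
        raise ValueError("not an EOF v1 container")
    pos = 3

    def read(n: int) -> int:
        # Read an n-byte big-endian unsigned integer and advance the cursor.
        nonlocal pos
        if pos + n > len(raw):
            raise ValueError("truncated header")
        value = int.from_bytes(raw[pos:pos + n], "big")
        pos += n
        return value

    if read(1) != KIND_TYPE:
        raise ValueError("expected kind_type")
    type_size = read(2)
    if read(1) != KIND_CODE:
        raise ValueError("expected kind_code")
    num_code = read(2)
    code_sizes = [read(2) for _ in range(num_code)]

    kind = read(1)
    container_sizes: List[int] = []
    if kind == KIND_CONTAINER:  # optional
        num_containers = read(2)
        container_sizes = [read(2) for _ in range(num_containers)]
        kind = read(1)
    metadata_size = None
    if kind == KIND_METADATA:  # optional, must precede kind_data
        metadata_size = read(2)
        if metadata_size == 0:
            raise ValueError("metadata_size must be in 0x0001-0xFFFF")
        kind = read(1)
    if kind != KIND_DATA:
        raise ValueError("expected kind_data")
    data_size = read(2)
    if read(1) != 0x00:
        raise ValueError("missing terminator")
    return Header(type_size, code_sizes, container_sizes,
                  metadata_size, data_size, pos)


def split_sections(raw: bytes) -> Tuple[bytes, bytes]:
    """Return (metadata_section, data_section) of a container."""
    h = parse_header(raw)
    start = h.length + h.type_size + sum(h.code_sizes) + sum(h.container_sizes)
    meta_len = h.metadata_size or 0
    metadata = raw[start:start + meta_len]
    data = raw[start + meta_len:start + meta_len + h.data_size]
    return metadata, data


# Hand-built container: one 1-byte code section, 3 bytes of metadata, 2 of data.
example = bytes.fromhex(
    "ef0001" "010004" "0200010001" "050003" "040002" "00"  # header
    "00800000" "00" "616263" "beef"                        # body
)
metadata, data = split_sections(example)
print(metadata)  # b'abc'
```

Because the metadata length is read from its own header field, a verifier could skip exactly those bytes without inspecting the code or the data section.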

Rationale

The metadata_section in the body, as well as the kind_metadata and metadata_size fields in the header, are OPTIONAL. This way, the compilers can avoid additional bytes in the container if they don’t want to write any metadata. The data_section can change in size and content during deployment; therefore it needs to be REQUIRED, even if the data is empty. The metadata_section is not expected to change during deployment.

The reason for placing the metadata_section before the data_section, and assigning kind_metadata the value 0x05 (and not 0x04), is to make it easier for the existing EOF tooling to adapt to the changes. Additionally, if the metadata_section were placed after the data_section, changes to the data_section at deploy time would cause the metadata_section to shift. Placing the metadata_section before the data_section avoids this.
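The offset argument can be made concrete with a little arithmetic. The sketch below uses hypothetical section lengths, and `metadata_offset` is an illustrative helper, not part of any tool; it compares where the metadata would start within the body under both layouts when the data section grows by 32 bytes at deploy time.

```python
def metadata_offset(prefix_len: int, data_len: int, metadata_first: bool) -> int:
    """Byte offset of metadata_section within the body.

    prefix_len is the combined length of the types, code and container
    sections, which precede both candidate positions.
    """
    return prefix_len if metadata_first else prefix_len + data_len


prefix = 4 + 1  # hypothetical: 4-byte types_section plus a 1-byte code_section
for data_len in (2, 34):  # data_section before and after 32 bytes are appended
    print(data_len,
          metadata_offset(prefix, data_len, metadata_first=True),
          metadata_offset(prefix, data_len, metadata_first=False))
# prints:
# 2 5 7
# 34 5 39
```

With the metadata placed first, its offset stays at 5 in both cases; placed last, it moves from 7 to 39.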

Backwards Compatibility

No backward compatibility issues are expected since EIP-3540 is not implemented yet.

Security Considerations

No security considerations are expected, as the metadata section is not meant to be executed.

Copyright and related rights waived via CC0.

Citation

Please cite this document as:

Kaan Uzdogan (@kuzdogan), Marco Castignoli (@marcocastignoli), Manuel Wedler (@manuelwedler), "EIP-7834: Separate Metadata Section for EOF [DRAFT]," Ethereum Improvement Proposals, no. 7834, December 2024. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-7834.