Compute operations to reprice in EIP-7904

Maria Silva, January 2026

In this report, we present which operations should have a cost increase with EIP-7904 and describe the methodology to pick them. The analysis can be reproduced in the 0.5-gasbench_data_eda notebook.

Methodology

Data

The raw benchmark data was generated by running the EEST benchmark suite with the Nethermind benchmarking tooling. The database is hosted on a PostgreSQL server managed by the Nethermind team.

Each test was run multiple times to isolate random variations in runtime and outliers. The data was collected between 2026-01-05 and 2026-01-22.

The test are still using the Prague fork. A similar analysis is still needed for the Osaka fork.

All benchmarks were run on the performance branches of each client using the following hardware specification:

Specification Value
spec_processor_type x86_64
spec_system_os Linux
spec_kernel_release 6.8.0-53-generic
spec_kernel_version #55-Ubuntu SMP PREEMPT_DYNAMIC
spec_machine_arch x86_64
spec_processor_arch 64bit
spec_cpu_model AMD EPYC 7713 64-Core Processor
spec_num_cpus 32

Data processing

The raw benchmark data was processed as follows:

  1. Parsing test metadata: Each test title was parsed to extract the test file, test name, test parameters, fork version, and the target opcode or precompile being tested.

  2. Filtering invalid data: Tests with execution time of 0ms were excluded. The ethrex client was also excluded from the analysis as it is still in early development.

  3. Removing outliers: For each (client, test, opcode) combination, outliers were identified using the interquartile range (IQR) method:
    • Lower threshold: Q1 - 1.5 × IQR
    • Upper threshold: Q3 + 1.5 × IQR
    • Data points outside these thresholds were flagged as outliers and excluded from the worst-case analysis.
  4. Computing worst-case performance: For each (client, test, opcode) combination, the minimum non-outlier MGas/s value was selected to represent the worst-case execution performance.

  5. Aggregating by opcode: For each opcode, the worst-performing test across all clients was identified, along with the second-worst client’s performance on that same test.

Selecting candidates

Operations were selected as candidates for repricing based on the following criteria:

  1. Performance threshold: The worst-case MGas/s must be below 60 MGas/s. By increase the costs of these operations, we will be able to increase our base throughput 3x from our current 20Mgas/s.

  2. Multi-client validation: To avoid penalizing all clients for a single client’s implementation inefficiency, the second-worst client’s performance is also considered. If the second-worst client achieves significantly better performance (>20% above the threshold), the operation is flagged for client optimization rather than repricing.

  3. Excluding very slow tests: Tests with worst-case MGas/s below 20 MGas/s were analyzed separately to understand if the slowness is due to specific test parameters or possible errors in data. After Osaka, we are running at 20Mgas/s, so we should not observe test with values lower than this.

Performance results

Run times distribution by client

The benchmark data covers 237,733 test runs across 5 clients (Besu, Erigon, Geth, Nethermind, and Reth) on the Prague fork. The overall distribution of MGas/s shows significant variation across tests and clients.

mgas_distribution_by_client

The boxplot above shows that different clients have different performance characteristics:

  • Nethermind and Reth the highest median performance
  • Besu shows a wider distribution with a majority of tests at lower MGas/s values
  • Erigon and Geth have intermediate performance, with still some slower tests, but a higher median performance than Besu.

Worst vs. second-worst client

An important consideration for repricing is whether poor performance is isolated to a single client or affects multiple implementations. This is important for distinguishing between:

  • Operations that genuinely need repricing (multiple clients struggle)
  • Operations where a specific client needs optimization (only one client struggles)

The chart shows the number of tests in which each clietn was the worst performer. We can see that Besu is the worst performer in the majority of tests, followed by Erigon.

worst_client_countt

The next plot shows the distribution of the performance ratio between the worst client and second-worst client. Each boxplot shows this distribution by the client (i.e., when each client is the worst performer).

worst_second_worst_gap

For a majority of tests, the gap between worst and second-worst is small (<20%), suggesting that the worst client is not significantly underperforming on relation to the other clients. However, we do see some tests with the gas is much wider. These test are more frequent for when Besu is the worst client.

Underpriced operations at 60 MGas/s

The following table shows operations with worst-case performance below 60 MGas/s:

Operation Type Worst MGas/s Worst Client Second Worst MGas/s
MULMOD opcode 20.60 besu 57.02
MODEXP precompile 21.63 geth 25.06
EQ opcode 22.80 besu 122.96
SDIV opcode 23.14 besu 85.18
REVERT opcode 23.38 besu 110.41
SMOD opcode 24.99 besu 67.98
MOD opcode 25.27 besu 70.98
SAR opcode 27.15 besu 134.36
MUL opcode 27.60 besu 147.87
SUB opcode 28.80 besu 122.66
DIV opcode 29.94 besu 88.54
SHIFT opcode 30.47 besu 124.32
point evaluation precompile 31.75 erigon 31.85
RETURN opcode 32.95 besu 103.85
ADDMOD opcode 32.98 besu 91.12
CALLCODE opcode 34.78 besu 112.96
CALLDATALOAD opcode 35.52 besu 77.65
CALL opcode 35.60 besu 98.94
DELEGATECALL opcode 36.30 besu 127.70
SELFDESTRUCT opcode 36.65 besu 628.81
STATICCALL opcode 37.60 besu 105.77
CALLDATACOPY opcode 38.08 besu 193.12
KECCAK opcode 38.49 besu 70.90
SHL opcode 39.64 besu 136.58
SHR opcode 41.84 besu 134.38
BLS12_G1ADD precompile 41.97 besu 73.64
XOR opcode 47.34 besu 122.21
blake2f precompile 47.48 reth 50.07
ecAdd precompile 47.89 besu 63.51
BLS12_G2ADD precompile 49.01 besu 69.11
SHA2-256 precompile 52.29 besu 235.32
AND opcode 54.66 besu 122.87
identity precompile 54.74 besu 178.01
OR opcode 54.91 besu 126.59
ecRecover precompile 55.04 besu 58.41
TLOAD opcode 55.90 erigon 789.42
CALLDATASIZE opcode 56.91 besu 134.27
MSTORE opcode 57.07 besu 145.72
ecPairing precompile 57.34 nethermind 67.85
ecMul precompile 58.66 reth 90.32

This table is then split into two categories below: Candidates for repricing (where multiple clients struggle) and Operations requiring client optimization (where only one client struggles).

As expected, there are a significant number of operations where Besu is performing bellow 60Mgas/s, but the rest of the clients have a significantly higher performance. These are the likely cases where a single client optimization is needed.

MODEXP is still performing at 21.63 Mgas/s, however we expect this value to change in the Osaka branch as this operation was already repriced there. We need to run this again on the newest fork.

Slow tests

We also observed test performing at less than 20Mgas/s. Since there are likely issues in the data, we exclude them from the underpriced operations. However, we need to confirm whether this is actually issues in the test and not a new bottleneck. The test in question are the following:

  • test_extcode_ops in Geth and Erigon
  • test_xcall in Geth
  • test_artimetic in Besu (only opcode_ADD)
  • test_selfbalance in Besu (only contract_balance_0)
  • test_call_frame_context_ops in Besu (opcode_ORIGIN)

Final list

Candidates for repricing

The following operations are candidates for repricing under EIP-7904. These are operations where:

  • The worst-case MGas/s is below 60 MGas/s, AND
  • The second-worst client also performs below 72 MGas/s (60 × 1.2), indicating the poor performance is not isolated to a single client
Operation Type Worst MGas/s Worst Client Second Worst MGas/s Second Worst / Worst
MULMOD opcode 20.60 besu 57.02 2.77×
MODEXP precompile 21.63 geth 25.06 1.16×
SMOD opcode 24.99 besu 67.98 2.72×
MOD opcode 25.27 besu 70.98 2.81×
point evaluation precompile 31.75 erigon 31.85 1.00×
KECCAK opcode 38.49 besu 70.90 1.84×
BLS12_G1ADD precompile 41.97 besu 73.64 1.75×
blake2f precompile 47.48 reth 50.07 1.05×
ecAdd precompile 47.89 besu 63.51 1.33×
BLS12_G2ADD precompile 49.01 besu 69.11 1.41×
ecRecover precompile 55.04 besu 58.41 1.06×
ecPairing precompile 57.34 nethermind 67.85 1.18×

Operations requiring client optimization

The following operations have poor performance on a single client but acceptable performance on others. These should be addressed through client optimization rather than protocol-level repricing:

Operation Type Worst MGas/s Worst Client Second Worst MGas/s Gap
EQ opcode 22.80 besu 122.96 5.4×
SDIV opcode 23.14 besu 85.18 3.7×
REVERT opcode 23.38 besu 110.41 4.7×
SAR opcode 27.15 besu 134.36 4.9×
MUL opcode 27.60 besu 147.87 5.4×
SUB opcode 28.80 besu 122.66 4.3×
DIV opcode 29.94 besu 88.54 3.0×
SHIFT opcode 30.47 besu 124.32 4.1×
RETURN opcode 32.95 besu 103.85 3.2×
ADDMOD opcode 32.98 besu 91.12 2.8×
CALLCODE opcode 34.78 besu 112.96 3.2×
CALLDATALOAD opcode 35.52 besu 77.65 2.2×
CALL opcode 35.60 besu 98.94 2.8×
DELEGATECALL opcode 36.30 besu 127.70 3.5×
SELFDESTRUCT opcode 36.65 besu 628.81 17.2×
STATICCALL opcode 37.60 besu 105.77 2.8×
CALLDATACOPY opcode 38.08 besu 193.12 5.1×
SHL opcode 39.64 besu 136.58 3.4×
SHR opcode 41.84 besu 134.38 3.2×
XOR opcode 47.34 besu 122.21 2.6×
SHA2-256 precompile 52.29 besu 235.32 4.5×
AND opcode 54.66 besu 122.87 2.2×
identity precompile 54.74 besu 178.01 3.3×
OR opcode 54.91 besu 126.59 2.3×
TLOAD opcode 55.90 erigon 789.42 14.1×
CALLDATASIZE opcode 56.91 besu 134.27 2.4×
MSTORE opcode 57.07 besu 145.72 2.6×
ecMul precompile 58.66 reth 90.32 1.5×

The next step is to reach out to the individual clients and assess the reason for the slow performance and whether it can be improved by Glamsterdam.

Client feedback

The Besu team provided the following feedback:

  • They are not yet able to reproduce the numbers for EQ, so more analysis here is needed.
  • They are already working on optimizing arithmetic operations (e.g. MOD, SMOD, ADDMOD, MULMOD, DIV, SDIV, MUL and SUB). They expect the performance to improve with new Uint256 implementation.
  • They proposed to add to the repricing table all the DIV related opcodes as they are more complex than simple arithmetic like ADD and based on the same algorithm. The list of these operations is MOD, SMOD, ADDMOD, MULMOD, DIV, and SDIV.