Standardized names of common metrics for Ethereum clients to use with Prometheus, a widely used monitoring and alerting solution.
Abstract
Many Ethereum clients expose a range of metrics in a format compatible with Prometheus to allow operators to monitor the client’s behaviour and performance and raise alerts if the chain isn’t progressing or there are other indications of errors.
While the majority of these metrics are highly client-specific, reporting on internal implementation details of the client, some are applicable to all clients.
By standardizing the naming and format of these common metrics, operators are able to monitor the operation of multiple clients in a single dashboard or alerting configuration.
Motivation
Using common names and meanings for metrics which apply to all clients allows node operators to monitor clusters of nodes using heterogeneous clients using a single dashboard and alerting configuration.
Currently there are no agreed names or meanings, leaving client developers to invent their own making it difficult to monitor a heterogeneous cluster.
Specification
The table below defines metrics which may be captured by Ethereum clients which expose metrics to Prometheus. Clients may expose additional metrics however these should not use the ethereum_ prefix.
Name
Metric type
Definition
JSON-RPC Equivalent
ethereum_blockchain_height
Gauge
The current height of the canonical chain
eth_blockNumber
ethereum_best_known_block_number
Gauge
The estimated highest block available
highestBlock of eth_syncing or eth_blockNumber if not syncing
ethereum_peer_count
Gauge
The current number of peers connected
net_peerCount
ethereum_peer_limit
Gauge
The maximum number of peers this node allows to connect
No equivalent
Note that ethereum_best_known_block_number always has a value. When the eth_syncing JSON-RPC method would return false, the current chain height is used.
Rationale
The defined metrics are independent of Ethereum client implementation but provide sufficient information to create an overview dashboard to support monitoring a group of Ethereum nodes.
There is a similar, though more prescriptive, specification for beacon chain client metrics.
The specific details of how to expose the metrics has been omitted as there is variance in existing implementations and standardising this does not provide any significant benefit.
Backwards Compatibility
This is not a consensus affecting change.
Clients may already be publishing these metrics using different names and changing to the new form may break existing alerts or dashboards. Clients that want to avoid this incompatibility can expose the metrics under both the old and new names.
Clients may also be publishing metrics with a different meaning using these names. Backwards compatibility cannot be preserved in this case.
Implementation
Pantheon switched to using these standard metric names in its 1.2 release: https://github.com/PegaSysEng/pantheon/pull/1634.