
Extend /eth/v1/node/peers with peer scoring and disconnect reasons #606

Open
barnabasbusa wants to merge 8 commits into ethereum:master from barnabasbusa:bbusa/peer-scoring-spec

Conversation

barnabasbusa commented May 15, 2026

Member

Summary

Extends the existing /eth/v1/node/peers and /eth/v1/node/peers/{peer_id} responses with four optional fields exposing per-peer scoring and disconnect information:

  • agent_version — the peer's libp2p identify agent string (e.g. Lighthouse/v8.1.3-...)
  • score — the client-native peer score
  • disconnect_reason — why the client last disconnected from the peer (controlled vocabulary)
  • downscore_reasons — the distinct set of reasons the peer was downscored this session (controlled vocabulary)

Two new string enums, PeerScoreReason and PeerDisconnectReason, define the controlled vocabularies.
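
As a rough illustration of what "optional" means for a consumer, the sketch below parses one peer entry and leaves each of the four new fields unset when the node omits it. The `Peer` dataclass and `parse_peer` helper are hypothetical names, not part of the spec or of any client; the pre-existing field names are taken from the sample response quoted later in this thread.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Peer:
    # Fields that already appear in the existing response.
    peer_id: str
    state: str
    direction: str
    last_seen_p2p_address: Optional[str] = None
    # The four optional additions proposed here; None means "not reported".
    agent_version: Optional[str] = None
    score: Optional[float] = None
    disconnect_reason: Optional[str] = None
    downscore_reasons: Optional[List[str]] = None


def parse_peer(obj: Dict[str, Any]) -> Peer:
    """Build a Peer from one decoded JSON entry, tolerating missing optional keys."""
    score = obj.get("score")
    reasons = obj.get("downscore_reasons")
    return Peer(
        peer_id=obj["peer_id"],
        state=obj["state"],
        direction=obj["direction"],
        last_seen_p2p_address=obj.get("last_seen_p2p_address"),
        agent_version=obj.get("agent_version"),
        score=float(score) if score is not None else None,
        disconnect_reason=obj.get("disconnect_reason"),
        downscore_reasons=list(reasons) if reasons is not None else None,
    )
```

Keeping `None` distinct from an empty list lets a consumer tell "the client does not report downscore reasons" apart from "the client reports none for this peer".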

Motivation

Peer scoring is one of the least observable parts of a running consensus node, yet it drives most peer churn. Every client already computes rich scoring internally but exposes it only through divergent, non-standard endpoints:

  • Lighthouse — GET /lighthouse/peers
  • Lodestar — GET /eth/v1/lodestar/lodestar_peer_score_stats
  • Teku — GET /teku/v1/nodes/peer_scores
  • Prysm — internal ScoreInfo proto (WIP REST endpoint)

This fragmentation means operators, monitoring tools, and testing frameworks (e.g. Kurtosis / interop debugging) can't ask "why is this node dropping peers?" in a client-agnostic way. Standardizing these fields lets tooling diagnose connectivity, fork/network mismatches, and rate-limiting uniformly across the network.

Design notes

  • Strictly additive. All four fields are optional, so existing consumers of /eth/v1/node/peers are unaffected.
  • Extends the existing schema rather than adding a new endpoint. An earlier revision proposed a dedicated /eth/v1/debug/node/peer_scores endpoint; feedback moved it onto the existing Peer object to avoid a parallel peer-listing API.
  • Native score is intentionally not normalized. Client score ranges differ widely ([-100,+100] for Lighthouse/Lodestar/Grandine, [-10,+20] for Teku, [0,1000] for Nimbus, [-100,+1] for Prysm), so score is only meaningful relative to other peers on the same client.
  • Controlled vocabularies with an unknown fallback. The reason enums capture the realistic cross-client union rather than a full taxonomy; clients map finer-grained internal tags onto the closest listed value, and consumers are told to tolerate unknown values for forward compatibility.
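
A minimal consumer-side sketch of the last two design notes, under stated assumptions: `normalize_reason` folds any unrecognised value onto `unknown`, and `worst_first` orders the peers of a single node by native score without attempting cross-client normalization. Both function names are hypothetical. The reason set is copied from the `PeerScoreReason` enum as quoted in the review thread on types/p2p.yaml, which is marked outdated and may differ from the final spec.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Illustrative only: copied from one (outdated) revision of PeerScoreReason.
KNOWN_SCORE_REASONS = frozenset({
    "rpc_invalid_request",
    "rpc_invalid_response",
    "rpc_rate_limited",
    "rpc_timeout",
    "rpc_io_error",
    "rpc_bad_blocks_by_range",
    "rpc_bad_blocks_by_root",
    "gossip_invalid_block",
    "gossip_invalid_attestation",
    "gossip_invalid_blob_sidecar",
    "gossip_invalid_data_column_sidecar",
    "sync_bad_batch",
    "status_unviable_fork",
    "behaviour_penalty",
    "unknown",
})


def normalize_reason(value: str) -> str:
    """Map any value outside the known vocabulary onto the 'unknown' fallback."""
    return value if value in KNOWN_SCORE_REASONS else "unknown"


def worst_first(peers: Iterable[Dict[str, Any]]) -> List[Tuple[str, float]]:
    """Order the peers of ONE node by native score, lowest (worst) first.

    Peers without a score are skipped. Scores from different clients use
    different ranges, so this should not be applied across nodes.
    """
    scored = [(p["peer_id"], p["score"]) for p in peers if p.get("score") is not None]
    return sorted(scored, key=lambda item: item[1])
```

A tool that wants to keep the raw string for logging can store it alongside the normalized value rather than discarding it.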

Introduces a new debug endpoint that returns the consensus client's
current per-peer scoring snapshot. Each entry includes the client-
native blended score, the per-subsystem score components the client
chooses to expose, the most recent score-affecting action (if any),
and the most recent disconnect (if any).

The schema is intentionally permissive about what each client
surfaces, because the visibility eth2 clients have into their own
scoring varies widely:

- `components` is a flexible map; `gossipsub` is optional because
  some clients (notably Nimbus) have no application-layer access to
  the underlying libp2p scores.
- `last_action` and `last_disconnect` are both optional - gossipsub-
  driven disconnects in particular often bypass the client's reason-
  capture path.
- `score_range` is required so consumers can normalize across
  clients whose native score ranges differ wildly: [-100, +100] for
  Lighthouse / Lodestar / Grandine, [-10, +20] for Teku, [0, 1000]
  for Nimbus, [-100, +1] for Prysm.

`PeerScoreReason` and `PeerDisconnectReason` are controlled
vocabularies that group the common cross-client causes; the original
client-side string is preserved in `native_reason` so consumers can
distinguish e.g. multiple `rpc_*` flavors that map to the same
controlled code.

Prior art motivating this proposal:
- Lighthouse `GET /lighthouse/peers`
- Lodestar `GET /eth/v1/lodestar/lodestar_peer_score_stats`
- Teku `GET /teku/v1/nodes/peer_scores`
- Prysm internal `ScoreInfo` proto + the WIP REST endpoint
  on `OffchainLabs/prysm:peer-scores-ui`

Matches the existing flat snake_case URL convention used by
/eth/v1/node/peer_count and the file name peer_scores.yaml. The prior
revision used /eth/v1/debug/node/peers/scores, which mixed a
subdirectory-style path with the flat style of the rest of the API.

Drop the separate /eth/v1/debug/node/peer_scores endpoint and the
PeerScore/PeerScoreRange/PeerScoreComponents/PeerScoreAction/
PeerDisconnectAction object hierarchy in favour of four optional
fields on the existing Peer schema: agent_version, score,
disconnect_reason, downscore_reasons.

Trim PeerScoreReason from 51 to 15 controlled values and
PeerDisconnectReason from 20 to 8, keeping the cross-client realistic
union rather than the full taxonomy. Implementations that compute
finer-grained internal tags are expected to map them onto the closest
listed value.

Strictly additive change to /eth/v1/node/peers and /eth/v1/node/peers/
{peer_id}: all new fields are optional so existing consumers are
unaffected.
@barnabasbusa barnabasbusa changed the title from "Bbusa/peer scoring spec" to "feat: extend /eth/v1/node/peers with peer scoring and disconnect reasons" May 15, 2026
@barnabasbusa barnabasbusa marked this pull request as draft May 15, 2026 07:44
barnabasbusa added a commit to barnabasbusa/lighthouse that referenced this pull request May 15, 2026
… state

Per the proposed beacon-API spec
(ethereum/beacon-APIs#606), `disconnect_reason`
MUST only be populated when the peer's `state` is `disconnected` or
`disconnecting`. Wrap the existing `last_disconnect()` lookup in both
the single-peer and list handlers so the field is omitted (None) for
connected/connecting peers.
barnabasbusa added a commit to barnabasbusa/lodestar that referenced this pull request May 15, 2026
… state

Per the proposed beacon-API spec
(ethereum/beacon-APIs#606), `disconnect_reason`
MUST only be populated when the peer's `state` is `disconnected` or
`disconnecting`. Compute the node-peer view first and only attach a
mapped `disconnectReason` when the resolved state matches.
barnabasbusa added a commit to barnabasbusa/teku that referenced this pull request May 15, 2026
…ing state

Per the proposed beacon-API spec
(ethereum/beacon-APIs#606), `disconnect_reason`
MUST only be populated when the peer's `state` is `disconnected` or
`disconnecting`. Teku exposes only `connected`/`disconnected` via
`Eth2Peer#isConnected()`, so suppress the field for connected peers.
barnabasbusa added a commit to barnabasbusa/prysm that referenced this pull request May 15, 2026
Per the proposed beacon-API spec
(ethereum/beacon-APIs#606), `disconnect_reason`
MUST only be populated when the peer's `state` is `disconnected` or
`disconnecting`. Only look up the last goodbye when the (lowercased)
peer state matches one of those values.
barnabasbusa added a commit to barnabasbusa/nimbus-eth2 that referenced this pull request May 15, 2026
Per the proposed beacon-API spec
(ethereum/beacon-APIs#606), `disconnect_reason`
MUST only be populated when the peer's `state` is `disconnected` or
`disconnecting`. Gate the `peer.lastDisconnectReason` lookup in both
the list and single-peer handlers on `peer.connectionState` so the
field is omitted (Opt.none) for connected/connecting peers.
barnabasbusa added a commit to barnabasbusa/grandine that referenced this pull request May 15, 2026
Per the proposed beacon-API spec
(ethereum/beacon-APIs#606), `disconnect_reason`
MUST only be populated when the peer's `state` is `disconnected` or
`disconnecting`. Only map `last_disconnect()` when the resolved
`PeerState` matches one of those variants.
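
The rule repeated in the commit messages above can be sketched in a client-neutral way. This is not code from any of the referenced client branches; `disconnect_reason_for` and `DISCONNECT_STATES` are hypothetical names, and lowercasing the state is an assumption that mirrors the Prysm commit note.

```python
from typing import Optional

# States in which the proposed spec text allows disconnect_reason to be set.
DISCONNECT_STATES = frozenset({"disconnected", "disconnecting"})


def disconnect_reason_for(state: str, last_disconnect: Optional[str]) -> Optional[str]:
    """Return the stored reason only for disconnected/disconnecting peers.

    For connected or connecting peers the field is omitted (None), even if
    an older disconnect reason is still recorded for that peer.
    """
    if state.lower() in DISCONNECT_STATES:
        return last_disconnect
    return None
```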
@barnabasbusa barnabasbusa marked this pull request as ready for review June 11, 2026 09:50
Comment thread types/p2p.yaml
      $ref: "./p2p.yaml#/PeerConnectionState"
    direction:
      $ref: "./p2p.yaml#/PeerConnectionDirection"
    agent_version:

Member

we might wanna do a v2 instead of adding new fields, but curious what others think

Contributor

I'm happy with new fields as long as they're not mandatory.

Member Author

yeah none of this should be mandatory.
if we ever do a v2 we should have these mandatory.

Member

For this specific API it's fine to add new optional fields. It's more of a debug API, but mostly it's fine because we don't, or rather can't, support SSZ here because of the format/return values. I just want to make it clear that this is not a general pattern we want to have or encourage.

rolfyone commented Jun 22, 2026

Contributor

I'm not sure how common it is, but we only appear to have 'large_penalty' and 'small_penalty' for downscore reasons, so an initial implementation from Teku may not have that. Are the other fields still useful if we can't easily provide a downscore reason or disconnect reason, @barnabasbusa?
sample:

    {
      "peer_id": "...",
      "last_seen_p2p_address": "/ip4/0.0.0.0/udp/4477/quic-v1",
      "state": "connected",
      "direction": "inbound",
      "agent_version": "Prysm/v7.1.2/7950a249266a692551e5a910adb9a82a02c92040",
      "score": -483.74838746020384
    }

Comment thread CHANGES.md
Comment thread types/p2p.yaml Outdated
Comment on lines +127 to +142
enum:
- rpc_invalid_request
- rpc_invalid_response
- rpc_rate_limited
- rpc_timeout
- rpc_io_error
- rpc_bad_blocks_by_range
- rpc_bad_blocks_by_root
- gossip_invalid_block
- gossip_invalid_attestation
- gossip_invalid_blob_sidecar
- gossip_invalid_data_column_sidecar
- sync_bad_batch
- status_unviable_fork
- behaviour_penalty
- unknown

@nflaig nflaig Jun 25, 2026

Member

any reason why these are enum instead of just example? it seems to me that we wanna keep this list open to extend and not restrict to these values, also not all clients might return all the examples from here

also, why is there no gossip_invalid_payload_envelope for example? and many others are missing, so if it's not an exhaustive list might as well limit it to just a few examples as guidance for implementers

Contributor

They wanted a limited list, I think, so it's semi-standard across all clients...

Member

we can make it a limited list sure, but then someone has to take time and make it useful and relevant

there are 0 gloas related gossip errors besides the data column one, nothing related to bid, payload envelopes or ptc attestations

also why is there gossip_invalid_blob_sidecar? that doesn't make any sense, if we want a limited list, then it needs to be well-defined and useful

Comment thread types/p2p.yaml
- rate_limited
- io_error
- client_shutdown
- unknown

Member

same as above, consider using example

Comment thread types/p2p.yaml Outdated
Comment on lines +70 to +74
Client-native peer score. OPTIONAL. The scale and meaning is
implementation-defined - consumers SHOULD treat it as a relative
signal within a single client, not directly comparable across
clients. Lower values indicate worse standing. Clients that do
not maintain a per-peer score MAY omit this field.

Member

are we fine with having all these vibe coded descriptions? I definitely don't have time to polish this but if someone can it would be great

nflaig commented Jun 25, 2026

Member

@barnabasbusa can you add some rationale to the PR body/description on why we want/need this change? For posterity, this would be useful to have/persist here.

@nflaig nflaig changed the title from "feat: extend /eth/v1/node/peers with peer scoring and disconnect reasons" to "Extend /eth/v1/node/peers with peer scoring and disconnect reasons" Jun 25, 2026
Comment thread types/p2p.yaml
    downscore_reasons:
      type: array
      description: |
        Reasons that the client has been down scored in their current session. OPTIONAL.


Is this supposed to be added per occurrence, or once if it occurred at least once?

E.g., if gossip_invalid_block happened twice, should we return

["gossip_invalid_block", "gossip_invalid_block"]

or

["gossip_invalid_block"]

I assume the latter, but it might be worth clarifying this in the description.

Contributor

I would expect the latter. The description was a mess, so I reduced it substantially because it was AI word slop... It can potentially be described as a distinct set...
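
The "distinct set" reading discussed in this thread could be sketched as follows. The `distinct_reasons` helper is a hypothetical name, and keeping first-seen order is an assumption, since the thread does not say whether order matters.

```python
from typing import Iterable, List


def distinct_reasons(events: Iterable[str]) -> List[str]:
    """Collapse repeated downscore reasons, keeping the first-seen order."""
    # dict preserves insertion order, so duplicate keys drop out in place.
    return list(dict.fromkeys(events))
```

With this reading, two `gossip_invalid_block` events in one session would be reported as a single entry.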


4 participants