There are a few things about block production that would be interesting to have per-validator metrics on:
- The skip rate (blocks produced, out of slots assigned in the leader schedule)
- Transaction fees received per block (so we have a better view of whether validators are profitable)
I’m thinking we might add a separate thread to the maintainer daemon that:
- Keeps per-validator counters: `slots_assigned_total`, `blocks_produced_total`, `transaction_fee_lamports_total`.
- Fetches the leader schedule
- In the main loop, if the current slot has advanced past slots where one of the Lido validators was the leader:
  - Increment `slots_assigned_total` for those slots.
  - Call `getBlock` for those slots to see if the block was produced. If so, increment `blocks_produced_total`, and also add its transaction fees to `transaction_fee_lamports_total`.
- Expose that info as per-validator Prometheus metrics
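The steps above could look roughly like the following. This is only a minimal sketch in Python to show the bookkeeping, not how the maintainer would implement it. The names `ValidatorCounters`, `observe_slots`, `render_metrics`, and the injected `get_block_fees` callback are hypothetical; the callback stands in for a `getBlock` call and is assumed to return `None` for a skipped slot and the summed transaction fees otherwise. The sketch also assumes the leader schedule has already been converted to absolute slot numbers, since `getLeaderSchedule` reports indices relative to the first slot of the epoch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ValidatorCounters:
    slots_assigned_total: int = 0
    blocks_produced_total: int = 0
    transaction_fee_lamports_total: int = 0


def observe_slots(
    counters: Dict[str, ValidatorCounters],
    leader_schedule: Dict[str, List[int]],
    last_seen_slot: int,
    current_slot: int,
    get_block_fees: Callable[[int], Optional[int]],
) -> None:
    """Account for leader slots in the range (last_seen_slot, current_slot].

    `leader_schedule` maps a validator identity to absolute slot numbers
    (assumption: already converted from epoch-relative indices).
    `get_block_fees` is a hypothetical stand-in for `getBlock`: it returns
    None if the slot was skipped, else the block's total fees in lamports.
    """
    for validator, slots in leader_schedule.items():
        entry = counters.setdefault(validator, ValidatorCounters())
        for slot in slots:
            if not (last_seen_slot < slot <= current_slot):
                continue
            entry.slots_assigned_total += 1
            fees = get_block_fees(slot)
            if fees is not None:
                entry.blocks_produced_total += 1
                entry.transaction_fee_lamports_total += fees


def render_metrics(counters: Dict[str, ValidatorCounters]) -> str:
    """Render the counters in the Prometheus text exposition format."""
    names = (
        "slots_assigned_total",
        "blocks_produced_total",
        "transaction_fee_lamports_total",
    )
    lines = []
    for name in names:
        lines.append(f"# TYPE {name} counter")
        for validator in sorted(counters):
            value = getattr(counters[validator], name)
            lines.append(f'{name}{{validator="{validator}"}} {value}')
    return "\n".join(lines) + "\n"
```

The main loop would call `observe_slots` with the previously observed slot and the current one, then remember the current slot for the next iteration. Passing the block lookup in as a callback keeps the counting logic testable without an RPC node.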
Then we can compute:
- The skip rate over any given time period: `1 - rate(blocks_produced_total[30d]) / rate(slots_assigned_total[30d])`
- Average tx fee per produced block: `sum(rate(transaction_fee_lamports_total[30d])) / sum(rate(blocks_produced_total[30d]))`
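To make the arithmetic concrete: a ratio of two `rate(...)` values over the same window equals the ratio of the counter increases over that window, so both queries reduce to simple divisions. Here is a small sketch with made-up numbers; the function names and the 30-day increases are hypothetical, not measured values. Dividing the fees by blocks produced gives an average per produced block, while dividing by slots assigned gives an average per assigned slot, skipped ones included.

```python
from typing import Optional


def skip_rate(blocks_produced: int, slots_assigned: int) -> Optional[float]:
    """Fraction of assigned leader slots in the window that produced no block."""
    if slots_assigned == 0:
        return None  # No leader slots in the window, so the rate is undefined.
    return 1.0 - blocks_produced / slots_assigned


def average_fee_lamports(fee_lamports: int, count: int) -> Optional[float]:
    """Average fee per unit of `count` (produced blocks or assigned slots)."""
    if count == 0:
        return None
    return fee_lamports / count


# Hypothetical counter increases over a 30-day window for one validator.
slots_assigned = 400
blocks_produced = 300
fee_lamports = 6_000_000

print(skip_rate(blocks_produced, slots_assigned))           # 0.25
print(average_fee_lamports(fee_lamports, blocks_produced))  # 20000.0
print(average_fee_lamports(fee_lamports, slots_assigned))   # 15000.0
```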
This will miss the info for blocks that were produced while the daemon was not running. We could go further back in history, but I don’t see a way to reconcile a stateless daemon with that. For one, the RPC can only return the leader schedule for the current epoch, so we’d need to save the schedule to have access to it later. Also, we would need to save the counter values and turn them into gauges, to avoid double-counting at restart.