
feat(control): add runtime traffic metrics and node latency probing #968

Merged
kunish merged 3 commits into daeuniverse:main from ksong008:feat/runtime-traffic-latency on Apr 18, 2026

ksong008 commented on Apr 17, 2026

Summary

This PR adds two runtime capabilities to the dae control plane:

  • runtime traffic metrics with rolling history
  • node latency probing support

It also includes a responsiveness fix for runtime traffic updates.

What changed

  • add runtime traffic statistics collection in control plane
  • keep a rolling runtime history instead of letting it grow without bound
  • expose upload/download rate and total-traffic snapshots for runtime consumers
  • improve runtime traffic update responsiveness
  • add node latency probing support in outbound dialer / control plane
  • provide control-plane access to node latency probe results

Runtime traffic metrics

This PR introduces runtime stats tracking for live traffic observation.

Highlights:

  • track upload and download activity continuously during runtime
  • keep recent history in a bounded rolling window
  • support current-rate style snapshots as well as accumulated totals
  • avoid unbounded memory growth by trimming old buckets

Node latency probing

This PR adds control-plane support for latency probing of configured nodes.

Highlights:

  • add probing logic for outbound nodes
  • expose probe results through control-plane snapshot / trigger methods
  • provide a foundation for higher-layer consumers such as dae-wing and daed

Fix included

This PR also includes a responsiveness improvement for runtime traffic updates so runtime stats react faster under active traffic.

Why

These changes are intended to support higher-level runtime dashboards and orchestration UIs that need:

  • live traffic overview
  • bounded recent traffic history
  • node latency measurements
  • explicit probe triggering and result snapshots

Scope

This PR only adds the runtime/control-plane capability in dae.

Follow-up changes in upper layers are expected separately:

  • dae-wing will consume these capabilities and expose them via API
  • daed will consume the API for UI presentation

Manual verification

  • start dae and confirm runtime traffic stats are updated while traffic is active
  • confirm traffic history remains bounded over time
  • confirm runtime totals continue accumulating during process lifetime
  • trigger node latency probing and confirm results are produced
  • verify no obvious regression in normal runtime behavior

Related follow-up

This PR is intended to be consumed by follow-up dae-wing / daed changes for:

  • runtime traffic overview
  • node latency testing / display

Tested locally.

Manual verification completed:

  • verified runtime traffic metrics update while traffic is active
  • verified runtime traffic history remains bounded instead of growing without limit
  • verified runtime totals continue accumulating during process lifetime
  • verified node latency probing can be triggered successfully
  • verified latency probe results are available from the control plane
  • verified no obvious regression in basic runtime behavior during manual testing

Please add the tested label if needed.

ksong008 requested a review from a team as a code owner on April 17, 2026 12:02
ksong008 changed the title from "Feat/runtime traffic" to "feat(control): add runtime traffic metrics and node latency probing" on Apr 17, 2026
kunish added the tested label on Apr 18, 2026

dae-prow (bot) left a comment

🧪 Since the PR has been fully tested, please consider merging it.

kunish merged commit 39831d5 into daeuniverse:main on Apr 18, 2026
30 checks passed
ksong008 deleted the feat/runtime-traffic-latency branch on April 18, 2026 16:20