Skip to content

Flashring externalize#267

Open
nileshsolankimeesho wants to merge 476 commits into
feat/ssd-cache-multi-shard-vmfrom
flashring-externalize
Open

Flashring externalize#267
nileshsolankimeesho wants to merge 476 commits into
feat/ssd-cache-multi-shard-vmfrom
flashring-externalize

Conversation

@nileshsolankimeesho
Copy link
Copy Markdown

🔁 Pull Request Template – BharatMLStack

Please fill out the following sections to help us review your changes efficiently.


📌 Summary

e.g., Adds optimizes Redis fetch latency in online-feature-store, or improves search UI responsiveness in trufflebox-ui.


📂 Modules Affected

  • horizon (Real-time systems / networking)
  • online-feature-store (Feature serving infra)
  • trufflebox-ui (Admin panel / UI)
  • infra (Docker, CI/CD, GCP/AWS setup)
  • docs (Documentation updates)
  • Other: ___________

✅ Type of Change

  • Feature addition
  • Bug fix
  • Infra / build system change
  • Performance improvement
  • Refactor
  • Documentation
  • Other: ___________

📊 Benchmark / Metrics (if applicable)

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 12, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (1)
  • develop

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 10fed8ce-f2f4-4c55-81cf-c56dc896d056

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Comment @coderabbitai help to get the list of available commands and usage tips.


// Init initializes the metrics client
func Init() {
if initialized {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can simply use once, another variable initialized is not needed

}

func (ma *MetricAverager) Add(value float64) {
ma.mu.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of this MetricAverager, We should directly publish metric, This is just adding 1 more unnecessary locking step.

Comment thread flashring/pkg/metrics/runmetrics.go Outdated
// Start a goroutine for each metric channel
mc.wg.Add(12)

go mc.collectShardMetric(mc.channels.RP99, "RP99")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These go routines wont be needed if we publish metric directly, saving cpu in the pod

Comment thread flashring/pkg/metrics/runmetrics.go Outdated
RecordRewrites(shardIdx int, value int64)

// Per-shard observation metrics
RecordRP99(shardIdx int, value time.Duration)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one we do a metric.Inc datadog itself provides the histogram values, we do not need to Record them explicitely

Comment thread flashring/pkg/metrics/runmetrics.go Outdated
Rewrites int64
RP99 time.Duration
RP50 time.Duration
RP25 time.Duration
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to record explicitely for both Read and Write

Comment thread flashring/pkg/metrics/runmetrics.go Outdated
Puts chan ShardMetricValue
Hits chan ShardMetricValue
ActiveEntries chan ShardMetricValue
ExpiredEntries chan ShardMetricValue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we directly publish the metric, we do not need to create these channels, Again this will help in saving CPU

Comment thread flashring/pkg/metrics/runmetrics.go Outdated
mc.instantMetrics[shardIdx][name] = 0
}

mc.instantMetrics[shardIdx]["KeyNotFoundCount"] = 0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be directly Incr from the method itself

Comment thread flashring/pkg/metrics/runmetrics.go Outdated
if !ok {
return
}
mc.mu.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also we are taking lock which should be needed

Comment thread flashring/pkg/metrics/runmetrics.go Outdated
shardMetrics[shardIdx] = ShardMetrics{
RP99: time.Duration(instants["RP99"]),
RP50: time.Duration(instants["RP50"]),
RP25: time.Duration(instants["RP25"]),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

managed by datadog

Comment thread flashring/pkg/metrics/statsd_logger.go Outdated
select {
case <-metricsCollector.stopCh:
return
case <-ticker.C:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we are running self ticker, datadog itself publishes metric in every 30s or so, so we do not need to explicitely run a ticker and leave it to be managed by datadog

azmat-meesho and others added 24 commits February 3, 2026 13:36
fix(interaction-store): make metadata update synchronous
* Added caller level metric in distributed cache, db

* feat(online-feature-store): add caller_id and db (redis/scylla) to feature.retrieve.db.caller.requests metric

Co-authored-by: Cursor <cursoragent@cursor.com>

* Added confid as metric tag

* changed inc factor

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
* Added otd path changes for Delta Model

* Fix invalid caching checks
nileshsolankimeesho and others added 30 commits March 30, 2026 10:40
Activity Contracts for Workflow Orchestrator
Workflow Orchestrator -> Plugin Env Support
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.