Conversation

@divyanshk
Contributor

@divyanshk divyanshk commented Jan 27, 2026

Summary

Central source of truth — Consolidates scattered CUPTI callback metadata (flow correlation, blocklist status) into a single registry. This will replace hardcoded checks throughout the codebase.

Minimal API — Exposes only 2 methods: requiresFlowCorrelation(domain, cbid) for CPU→GPU trace arrows and isBlocklisted(domain, cbid) for filtering noisy callbacks. Although we can add more, the overall idea should be to keep it minimal.

O(1) lookups — Uses separate std::unordered_map per domain (RUNTIME/DRIVER) with range support for memory operations, initialized lazily as a singleton.

Usage: TBD, but the registry would be initialized before the profiling stage, when deciding which callbacks to hook, and then consulted later when the collected activity buffers are processed.
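To make the shape of the API concrete, here is a minimal sketch of the registry described above. It is not the code in this PR: `CallbackRegistrySketch`, `Domain`, `CallbackId`, and `CallbackProps` are hypothetical stand-ins (the real code would presumably use CUPTI's own domain and callback ID types), and the numeric IDs are placeholders rather than real CUPTI callback IDs. The range check here is a linear scan over a handful of ranges, which is an assumption about how "range support" could work.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins so the sketch builds without CUPTI headers.
enum class Domain { Runtime, Driver };
using CallbackId = std::uint32_t;

// Metadata stored per callback; both flags default to false for unknown IDs.
struct CallbackProps {
  bool requiresFlowCorrelation;
  bool isBlocklisted;
};

class CallbackRegistrySketch {
 public:
  // Function-local static: constructed lazily on first call.
  static CallbackRegistrySketch& instance() {
    static CallbackRegistrySketch registry;
    return registry;
  }

  bool requiresFlowCorrelation(Domain domain, CallbackId cbid) const {
    return lookup(domain, cbid).requiresFlowCorrelation;
  }

  bool isBlocklisted(Domain domain, CallbackId cbid) const {
    return lookup(domain, cbid).isBlocklisted;
  }

 private:
  // Inclusive ID range sharing one set of properties (e.g. memory operations).
  struct Range {
    CallbackId first;
    CallbackId last;
    CallbackProps props;
  };

  // One table per domain: exact IDs in a hash map, plus a few ranges.
  struct DomainTable {
    std::unordered_map<CallbackId, CallbackProps> exact;
    std::vector<Range> ranges;
  };

  CallbackRegistrySketch() {
    // Placeholder IDs only; real entries would name CUPTI callback IDs.
    runtime_.exact[10] = CallbackProps{true, false};   // needs a flow arrow
    runtime_.exact[20] = CallbackProps{false, true};   // noisy, filtered out
    runtime_.ranges.push_back(Range{100, 110, CallbackProps{true, false}});
    driver_.exact[30] = CallbackProps{true, false};
  }

  const DomainTable& table(Domain domain) const {
    return domain == Domain::Runtime ? runtime_ : driver_;
  }

  CallbackProps lookup(Domain domain, CallbackId cbid) const {
    const DomainTable& t = table(domain);
    auto it = t.exact.find(cbid);
    if (it != t.exact.end()) {
      return it->second;
    }
    // Small, fixed number of ranges, so this scan is effectively constant time.
    for (const Range& r : t.ranges) {
      if (cbid >= r.first && cbid <= r.last) {
        return r.props;
      }
    }
    return CallbackProps{false, false};  // unknown callback: no special handling
  }

  DomainTable runtime_;
  DomainTable driver_;
};
```

With something like this in place, a call site that currently compares `cbid` against hardcoded constants would instead ask `CallbackRegistrySketch::instance().isBlocklisted(domain, cbid)` or `requiresFlowCorrelation(domain, cbid)`, so adding or removing a callback only touches the registry's constructor.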

This PR by itself makes no material change to kineto. A future PR will refactor existing files to use this registry.

Note: this is different from CuptiCallbackApi's registerCallback, which registers generic function pointers to be called when CUPTI fires a callback. With CuptiCallbackRegistry, we want to store metadata/properties about CUPTI callbacks to help with post-processing (filtering, flow correlation checks, etc.).

@meta-cla meta-cla bot added the cla signed label Jan 27, 2026
@scotts
Contributor

scotts commented Jan 27, 2026

Can you show some examples of what using this new code looks like? Since this is the foundation for a future refactoring, can you show a before and after for that refactoring?
