Move TRT-RTX runtime controls to runtime context managers (v3, for review) by tp5uiuc · Pull Request #3 · tp5uiuc/TensorRT

tp5uiuc · 2026-06-03T17:18:42Z

Rewrites the v2 design (PR #2 base branch) to move cuda_graph_strategy, dynamic_shapes_kernel_specialization_strategy, runtime_cache from CompilationSettings / serialized engine slots to runtime context managers per pytorch#4310.

Summary

New RuntimeSettings dataclass on both Python and C++ sides; RuntimeCacheHandle registered as a torchbind class for shared-cache semantics.
Three new CMs in torch_tensorrt.runtime: runtime_config (pool API), runtime_cache (shared cache), plus per-knob sugars. All accept a list of modules.
New runtime_settings= kwarg on compile() / cross_compile_for_windows() / convert_module() for compile-time hints (1 context-create cost, no enter/exit recreate).
Per-engine update_runtime_settings(rs) with fast-path equality check; rebuilds IRuntimeConfig + recreates execution context on diff.
SerializedInfoIndex drops 4 RTX slots; SERIALIZATION_LEN back to 12.

Tests

New test_004_runtime_settings.py (12 tests) covering data model, compile-time hint, CM restore, multi-target, dispatch.
test_000_runtime_cache.py, test_001_dynamic_shapes_kernel_strategy.py, test_001_cuda_graph_strategy.py migrated to the new API.

Status

Pre-commit clean (SKIP=mypy for the pre-existing _TRTEngine.py errors tracked separately).
RTX wheel build succeeds; test_004 all 12 pass; Python-runtime half of the three other test files passes.
C++-engine path crashes inside libtensorrt_rtx.so.1 at cuda_engine->getStreamableWeightsSize() -- I confirmed this is a pre-existing environmental issue on the test node (the same crash occurs with a known-good pre-built v2 wheel installed in the same env), not a regression from this refactor.

…anagers Replaces the v2 design that packed three runtime-mode controls (``cuda_graph_strategy``, ``dynamic_shapes_kernel_specialization_strategy``, ``runtime_cache``) into ``CompilationSettings`` and the serialized engine tuple. Per pytorch#4310, these are runtime mode controls -- not engine properties -- and shouldn't pin at compile time or round-trip through serialization. Highlights: * New ``RuntimeSettings`` dataclass on both Python and C++ sides (``py/torch_tensorrt/runtime/_runtime_settings.py``, ``core/runtime/RuntimeSettings.h``). Three fields: ``dynamic_shapes_kernel_specialization_strategy``, ``cuda_graph_strategy``, ``runtime_cache``. The cache field accepts ``None``, a path string (engine creates an implicit handle, saves on ``__del__``, mirrors old ``runtime_cache_path=`` behavior), or a ``RuntimeCacheHandle`` (shared cache, lifecycle owned by the ``runtime_cache()`` CM). * New ``RuntimeCacheHandle`` registered as a torchbind class (``torch.classes.tensorrt.RuntimeCacheHandle``) so the same C++ ``IRuntimeCache`` shared_ptr crosses the Python/C++ boundary. * New per-engine ``update_runtime_settings`` API on both ``TRTEngine`` flavors. Fast-paths on settings equality; eagerly rebuilds ``IRuntimeConfig`` + recreates execution context on diff. * Three new context managers in ``torch_tensorrt.runtime``: ``runtime_config(target_or_targets, **kw)`` (the pool API; also yields the target so ``with runtime_config(model, ...) as m:`` works), ``runtime_cache(target, path)`` (shared cache CM), and the per-knob sugars ``set_cuda_graph_strategy`` / ``set_dynamic_shapes_kernel_strategy``. All three accept a list of modules for multi-target use; the cache CM yields the ``RuntimeCacheHandle`` for inspection or explicit ``save()``. * New ``runtime_settings=`` kwarg on ``compile()``, ``cross_compile_for_windows()``, and ``convert_module()`` so callers can prime the engine with the right values up front. Compile-time hint avoids the enter/exit recreate cost. * ``CompilationSettings`` loses the three fields; the compiler entry points drop the three kwargs. ``SerializedInfoIndex`` drops the four RTX-related slots; ``SERIALIZATION_LEN`` returns to 12. Engines saved with the old 16-slot layout will raise the existing layout-mismatch error on load. * Three existing test files migrated to the new API; new ``tests/py/dynamo/runtime/test_004_runtime_settings.py`` covers the data model, compile-time hint, runtime CM restore semantics, multi-target form, and dispatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

Two follow-up bugs exposed by the cross-runtime test parameterization on the C++ engine path: 1. ``torch.classes.tensorrt.Engine.update_runtime_settings(...)`` rejected Python ``None`` for the ``RuntimeCacheHandle`` argument because TorchBind does not auto-convert ``None`` to a null ``c10::intrusive_ptr``. Switch the signature to ``c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>>`` so the default ``runtime_cache=None`` case round-trips cleanly. 2. ``RuntimeSettings(runtime_cache="/some/path")`` only auto-saved to disk on engine destruction for the Python runtime (via ``_TRTEngine.__del__``). The C++ engine had no equivalent saver and the IRuntimeCache it materialized internally wasn't accessible from Python. Make the cpp path symmetric: - Expose ``serialize() -> at::Tensor`` / ``deserialize(at::Tensor)`` / ``has_cache()`` on the torchbind ``RuntimeCacheHandle`` class. ``at::Tensor`` of uint8 is used instead of ``std::string`` because TorchBind forces ``std::string`` through Python ``str`` (UTF-8) and serialized cache bytes are not valid UTF-8. - In ``TorchTensorRTModule.setup_engine`` (cpp branch), pre-materialize a torchbind handle when ``runtime_cache`` is a path string, store it on the module, and substitute it into ``_runtime_settings`` so the dispatch passes the same handle through. - Add ``_load_cpp_implicit_cache`` / ``_save_cpp_implicit_cache`` and a module ``__del__`` that mirrors the Python ``_TRTEngine`` saver, with ``filelock`` + atomic-rename semantics. - Teach ``_to_torchbind_handle`` to pass an already-torchbind ``torch.ScriptObject`` through unchanged. All cpp + python runtime tests pass on TRT-RTX 1.5: test_004 (12/12), test_000 (10/10), test_001 dynamic_shapes (14/14), test_001 cuda_graph (13/13).

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

tp5uiuc · 2026-06-03T22:36:40Z

+    runtime_cache: Optional[Union[str, "RuntimeCacheHandle"]] = None  # noqa: F821
+
+    def __post_init__(self) -> None:
+        if (


The _runtime_settings is only expected to be used in TRT-RTX. For regular TRT, let's issue a warning here saying this has no effect.

Done in 4bfa982. RuntimeSettings.__post_init__ now emits a one-shot UserWarning on non-RTX builds (gated by a module-level _NON_RTX_WARNING_EMITTED so it does not spam):

global _NON_RTX_WARNING_EMITTED if not ENABLED_FEATURES.tensorrt_rtx and not _NON_RTX_WARNING_EMITTED: warnings.warn( "RuntimeSettings is only honored on TRT-RTX builds; " "values will be ignored on standard TRT.", UserWarning, stacklevel=2, ) _NON_RTX_WARNING_EMITTED = True

…timeCacheHandle lifecycle Structural cleanup on top of the v3 work (no observable behavior change). C++ side -------- ``RuntimeSettings`` migrates from a ``TRTEngine`` member to a ``TRTRuntimeConfig`` member -- the value-type now lives with its primary consumer (the IRuntimeConfig builder). ``TRTRuntimeConfig`` gains ``set_settings()`` (the diff-and-invalidate primitive) and turns the static ``uses_internal_capture`` / ``is_monolithic_capturable`` helpers into instance methods so callers do not need to pass settings around. ``TRTEngine::runtime_settings()`` forwards through. Python side ----------- Introduces a Python ``TRTRuntimeConfig`` class mirroring the C++ struct. ``_TRTEngine`` drops its three legacy fields (``runtime_config``, ``runtime_settings``, ``_implicit_cache_handle``) for a single ``self._trt_runtime_config`` member; ``_create_execution_context`` / ``update_runtime_settings`` / ``_is_monolithic_capturable`` / ``_enable_rtx_native_cudagraphs`` all delegate. Every ``ENABLED_FEATURES.tensorrt_rtx`` branch related to runtime-mode controls is absorbed into the shim, so engine and module call sites stay uniform across TRT and TRT-RTX builds. Following the project's grouping convention, ``py/torch_tensorrt/runtime/_runtime_settings.py`` is merged into ``_runtime_config.py``; that file now holds ``RuntimeSettings``, the new ``TRTRuntimeConfig``, the existing ``runtime_config()`` CM, and its factory. Imports across the tree are repointed. RuntimeCacheHandle ownership model ---------------------------------- Save-on-destruction moves from the two engine-side ``__del__`` paths (``_TRTEngine.close()`` for Python runtime, ``TorchTensorRTModule.__del__`` for cpp runtime) onto ``RuntimeCacheHandle.__del__`` itself, gated by a new ``autosave_on_del`` flag. The flag is set by ownership context: * Engine-implicit handles (created from a path-string compile-time hint) get ``autosave_on_del=True`` -- no other Python object holds them, so the destructor is the only save opportunity. * The ``runtime_cache(target, path)`` CM uses ``autosave_on_del=False`` on the handle it constructs; its ``__exit__`` saves explicitly. * Hand-built handles default to ``autosave_on_del=False`` so save timing stays under the user's control. The handle additionally accepts a ``torchbind_handle`` sibling so the same Python object can wrap either a ``trt.IRuntimeCache`` (Python rt) or a ``torch.classes.tensorrt.RuntimeCacheHandle`` (cpp rt); ``save`` / ``load`` source bytes from whichever is populated. The cpp-runtime helpers on ``TorchTensorRTModule`` (``_load_cpp_implicit_cache``, ``_save_cpp_implicit_cache``, ``__del__``) and the duplicate save logic in ``_TRTEngine.close()`` are removed; both runtimes funnel through the single ``RuntimeCacheHandle.__del__`` path. Tests ----- test_000 grows two new tests asserting the new contract: * ``test_cm_does_not_double_save_on_rc_gc`` -- only one save fires per CM block even after ``rc`` is GC'd. * ``test_user_built_handle_no_autosave_by_default`` -- hand-built handles do not autosave on GC. All 51 runtime tests pass on the refactored design (test_004 12/12, test_000 12/12, test_001 ds 14/14, test_001 cg 13/13).

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

Five follow-up changes responding to PR review comments: * **Fold strategy sugar into ``_runtime_config.py``.** Delete ``_dynamic_shapes_kernel_strategy.py`` and ``_cuda_graph_strategy.py``; ``set_dynamic_shapes_kernel_strategy`` / ``set_cuda_graph_strategy`` now live alongside the ``runtime_config`` CM they delegate to. ``torch_tensorrt/runtime/__init__.py`` re-exports them from the consolidated module. * **Hoist ``RuntimeSettings`` defaults into ``_defaults.py``.** Three new constants (``DYNAMIC_SHAPES_KERNEL_SPECIALIZATION_STRATEGY``, ``CUDA_GRAPH_STRATEGY``, ``RUNTIME_CACHE_PATH``) mirror the compilation-settings pattern. ``RUNTIME_CACHE_PATH`` defaults to a per-user temp file similar to ``ENGINE_CACHE_DIR``, so users get a disk-backed runtime cache without explicit opt-in; override via ``RuntimeSettings(runtime_cache="/path")`` or the ``runtime_cache`` CM. Test_000 and test_004 updated to reflect the new default. * **Warn on non-RTX ``RuntimeSettings`` construction.** ``__post_init__`` now emits a one-shot ``UserWarning`` on regular TRT builds (gated by ``ENABLED_FEATURES.tensorrt_rtx``) so users see that the settings have no effect. * **Drop ``TYPE_CHECKING`` string forward-refs for ``RuntimeSettings``.** Direct top-level imports across ``_compiler.py``, ``_conversion.py``, ``_TRTEngine.py`` and ``_TorchTensorRTModule.py``; bare ``Optional[RuntimeSettings]`` annotations everywhere. Deferred imports inside ``__init__`` / ``__setstate__`` removed. All 51 runtime tests pass (test_004 12/12, test_000 12/12, test_001 ds 14/14, test_001 cg 13/13).

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

tp5uiuc · 2026-06-04T03:07:34Z

+        // Expose the underlying IRuntimeCache bytes to Python so the Python-
+        // side save/load logic can persist them under filelock. Returns an
+        // empty uint8 tensor if the cache hasn't been materialized yet.
+        //
+        // We return ``at::Tensor`` rather than ``std::string`` because TorchBind
+        // forces ``std::string`` to round-trip through Python ``str`` (UTF-8)
+        // and serialized cache bytes are not valid UTF-8.
+        .def(
+            "serialize",
+            [](const c10::intrusive_ptr<RuntimeCacheHandle>& self) -> at::Tensor {
+#ifdef TRT_MAJOR_RTX
+              auto opts = at::TensorOptions().dtype(at::kByte);
+              if (!self->cache) {
+                return at::empty({0}, opts);
+              }
+              auto host_mem = make_trt(self->cache->serialize());
+              if (!host_mem) {
+                return at::empty({0}, opts);
+              }


These methods "serialize", "deserialize", "has_cache" should be implemented elsewhere (in its cpp file). If the concern is introducing at:: symbols we can work around it. This file should only be used for registration.

Another smell is using #ifdef here, these should be hidden in the implementation.

Done in 111cdb2eb. serialize / deserialize / has_cache are now member functions of RuntimeCacheHandle declared in RuntimeSettings.h and implemented in RuntimeSettings.cpp. All #ifdef TRT_MAJOR_RTX lives inside those method bodies; the registration in register_jit_hooks.cpp is preprocessor-free for this class:

torch::class_<RuntimeCacheHandle>("tensorrt", "RuntimeCacheHandle") .def(torch::init<std::string>()) .def_readwrite("path", &RuntimeCacheHandle::path) .def("serialize", &RuntimeCacheHandle::serialize) .def("deserialize", &RuntimeCacheHandle::deserialize) .def("has_cache", &RuntimeCacheHandle::has_cache);

tp5uiuc · 2026-06-04T03:54:34Z

+            [](const c10::intrusive_ptr<TRTEngine>& self,
+               std::string const& dynamic_shapes_kernel_specialization_strategy,
+               std::string const& cuda_graph_strategy,
+               c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>> runtime_cache) -> void {


Is it not possible to implement this as a property with getter and setter because of this c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>> runtime_cache?

Possible — the c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>> signature is fine for torchbind def_property (device_memory_budget immediately below in this same registration is a property on TRTEngine, for a comparable point).

The reason update_runtime_settings is a single bundled setter is that RuntimeSettings is the unit of context invalidation: changing any one of the three fields ends up calling recreate_execution_context once. Splitting into three individual properties would cause three sequential context-recreates on the engine-setup path (where all three are set together via _dispatch_runtime_settings_to_engine). The diff-check inside TRTRuntimeConfig::set_settings would catch no-op repeats, but consecutive changing writes would each trigger a recreate.

If you would rather have property syntax I can split it, but the bundled form keeps setup tight. WDYT?

Mirror ``TRTRuntimeConfig.set_settings`` (Python runtime) on the cpp runtime path. Previously the cpp side dropped the C++ engine's intrusive_ptr on settings change but left ``self._implicit_cache_handle`` on the ``TorchTensorRTModule`` pointing at the *old* wrapper -- the new cache had no Python autosave companion and never wrote to disk. Factor the path-string-to-torchbind-handle materialization into ``TorchTensorRTModule._materialize_cpp_implicit_handle``. Called from ``setup_engine`` and ``_dispatch_runtime_settings_to_engine`` (cpp branch); synchronously saves the prior wrapper before swap, replaces ``self._implicit_cache_handle`` with the new one, then runs ``load()`` after the C++ engine has attached the IRuntimeCache. Test: ``test_set_runtime_settings_saves_prior_cache_on_swap`` (parametrized over both runtimes). Compiles with path A; swaps to path B; asserts A is written synchronously at swap time and B is written on ``del compiled``. The walk-to-inner-module is wrapped in a helper so the loop variable doesn't outlive the call and keep the inner TRT module alive past ``del compiled`` (which would suppress the post-del autosave). All 53 tests pass (test_004 12/12, test_000 14/14, test_001 ds 14/14, test_001 cg 13/13).

tp5uiuc · 2026-06-04T03:55:11Z

+  return dynamic_shapes_kernel_specialization_strategy == other.dynamic_shapes_kernel_specialization_strategy &&
+      cuda_graph_strategy == other.cuda_graph_strategy && runtime_cache.get() == other.runtime_cache.get();


Can't we use std::tie(a,b,c) == std::tie(other.a, other.b, other.c)?

Done in 111cdb2eb. Note: runtime_cache.get() returns a temporary T* so the raw pointers are hoisted into locals first to satisfy std::ties lvalue requirement.

auto* this_cache = runtime_cache.get(); auto* other_cache = other.runtime_cache.get(); return std::tie(dynamic_shapes_kernel_specialization_strategy, cuda_graph_strategy, this_cache) == std::tie(other.dynamic_shapes_kernel_specialization_strategy, other.cuda_graph_strategy, other_cache);

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

tp5uiuc · 2026-06-04T03:56:18Z

+  RuntimeSettings result = *this;
+  result.dynamic_shapes_kernel_specialization_strategy = override.dynamic_shapes_kernel_specialization_strategy;
+  result.cuda_graph_strategy = override.cuda_graph_strategy;
+  if (override.runtime_cache) {
+    result.runtime_cache = override.runtime_cache;
+  }
+  return result;
+}


I don't like this at all. An easier way of doing this is to define a swap method and implement it that way (similar to copy-and-swap idiom) for example

Dropped merge() entirely in 111cdb2eb (see also your follow-up comment below making the same point about member-wise assignment). Direct field assignment is the idiom now — no merge() helper.

tp5uiuc · 2026-06-04T03:57:13Z

+        .def("path", &RuntimeCacheHandle::path)
+        .def("set_path", &RuntimeCacheHandle::set_path)


path can be a property.

Done in 111cdb2eb. path is now a public field on the (now struct) RuntimeCacheHandle and registered as def_readwrite("path", &RuntimeCacheHandle::path). The old path() getter and set_path() setter methods are gone.

tp5uiuc · 2026-06-04T03:58:13Z

+ private:
+  std::string path_;


What does making path private get us at the end of day? its not like if path changes the runtime cache immediately changes. better to make it public and replace the entire class RuntimeCacheHandle with a struct.

Done in 111cdb2eb — RuntimeCacheHandle is now a struct with a public path field. The cache shared_ptr is also a public field. Agreed there were no encapsulation invariants worth preserving.

tp5uiuc · 2026-06-04T04:01:27Z

+  // For non-default detection on the strategy strings we always overlay; the cache
+  // pointer is overlaid iff `override.runtime_cache` is non-null.
+  RuntimeSettings merge(RuntimeSettings const& override) const;
+


what's the purpose of this again? Can't we do

auto another_runtime_settings = old.runtime_settings; another_runtime_settings.dynamic_shapes_kernel_specialization_strategy = "something_else"; old.runtime_settings = another_runtime_settings; // or std::swap(old.runtime_settings, another_runtime_settings);

instead? Why add anothr function if we are going to member-wise swap anyway?

Dropped merge() in 111cdb2. Direct field assignment / std::swap on the struct is exactly the right pattern — no helper required. (For context: Python RuntimeSettings is @dataclass(frozen=True), so on the Python side RuntimeSettings.merge(**kwargs) exists as the dataclass-replace idiom; that is unrelated and stays.)

tp5uiuc · 2026-06-04T04:03:18Z

    const std::string& serialized_metadata,
    const ResourceAllocationStrategy resource_allocation_strategy,
-    TRTRuntimeConfig runtime_cfg)
+    RuntimeSettings runtime_settings)


Will this constructor be ever called from the Python side so that Runtimesettings get passed in and context gets created only once?

Today: no. This is the "first" constructor overload (the public one that accepts a RuntimeSettings directly) — but the only torchbind init is the vector<string> one at L87 of register_jit_hooks.cpp, which goes through the deserialization constructor at L114 of TRTEngine.cpp, passing RuntimeSettings{} (defaults). The Python side then calls engine.update_runtime_settings(...) right after construction, which recreates the context once. So construction → recreate-1 (with defaults) → settings dispatch → recreate-2 (with the real settings).

To get it down to one recreate, I would add a second torch::init<...> overload that accepts the strategy strings + the cache handle alongside serialized_info, route through this RuntimeSettings-aware constructor, and skip the post-construction dispatch in Python. Doable in this PR if you want — but the savings are small (one createExecutionContext per engine) and the dispatch path stays as-is for live runtime_config CM changes anyway.

tp5uiuc · 2026-06-04T04:07:31Z

+  std::string dynamic_shapes_kernel_specialization_strategy = "lazy";
+  std::string cuda_graph_strategy = "disabled";


Do we need to store strings? Can't we store ints here since they will be converted to enumerations anyway? This way we just static cast them to get the enumeration

Tradeoff:

Ints: type-safe; static_cast<nvinfer1::DynamicShapesKernelSpecializationStrategy>(settings_.foo) is a no-op; one validation layer.

Strings: human-readable to_str() (logged via operator<<) is useful for support reports / debug logs; identical Python-and-C++ value-form for the canonical setting; reasonably grep-able through the codebase.

Today the layout is "strings everywhere, two validation points": Python RuntimeSettings.__post_init__ validates against the string→int map, then TRTRuntimeConfig::ensure_initialized re-validates inside to_trt_ds_strategy / to_trt_cg_strategy. Moving to int internally means: keep the Python-facing API as strings, convert once at the torchbind boundary (update_runtime_settings lambda), store ints, skip the second validation. The cost is that to_str() needs a reverse-lookup table to print "lazy" / "whole_graph_capture".

Happy to push that through if you prefer ints — I held off here because the current form keeps logs human-readable on both sides of the boundary. WDYT?

…Handle C++-side cleanup spurred by review comments on #3: - Convert ``RuntimeCacheHandle`` from a class with a private ``path_`` field + accessor methods (``path()`` / ``set_path()``) to a struct with a public ``path`` field. Re-register the torchbind binding via ``.def_readwrite("path", &RuntimeCacheHandle::path)``. - Move the bodies of ``serialize``, ``deserialize``, and ``has_cache`` out of the JIT-binding registration file (``register_jit_hooks.cpp``) and into member functions implemented in ``RuntimeSettings.cpp``. The ``#ifdef TRT_MAJOR_RTX`` guards live inside those impls; the registration file is preprocessor-free for these bindings. - Use ``std::tie`` in ``RuntimeSettings::operator==`` for cleaner field-wise comparison (raw ``intrusive_ptr::get()`` results hoisted to lvalues to satisfy ``std::tie``'s reference requirement). - Drop ``RuntimeSettings::merge``. C++ ``RuntimeSettings`` is now value-typed end-to-end; direct field assignment is the idiom. No callers used ``merge`` outside its own definition. No behavior change. Python-side ``RuntimeCacheHandle`` wrapper and the runtime test suite are unaffected.

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

github-actions Bot added component: api [Python] component: core component: dynamo component: runtime component: tests component: conversion labels Jun 3, 2026

github-actions Bot approved these changes Jun 3, 2026

View reviewed changes

tp5uiuc commented Jun 3, 2026

View reviewed changes

Comment thread py/torch_tensorrt/runtime/_cuda_graph_strategy.py Outdated

tp5uiuc commented Jun 3, 2026

View reviewed changes

Comment thread py/torch_tensorrt/runtime/_dynamic_shapes_kernel_strategy.py Outdated

tp5uiuc commented Jun 3, 2026

View reviewed changes

Comment thread py/torch_tensorrt/runtime/_runtime_settings.py Outdated

tp5uiuc commented Jun 3, 2026

View reviewed changes

Comment thread py/torch_tensorrt/dynamo/conversion/_conversion.py Outdated

github-actions Bot approved these changes Jun 4, 2026

View reviewed changes

tp5uiuc marked this pull request as draft June 4, 2026 02:22

github-actions Bot approved these changes Jun 4, 2026

View reviewed changes

tp5uiuc commented Jun 4, 2026

View reviewed changes

github-actions Bot approved these changes Jun 4, 2026

View reviewed changes

tp5uiuc commented Jun 4, 2026

View reviewed changes

github-actions Bot approved these changes Jun 4, 2026

View reviewed changes

		return dynamic_shapes_kernel_specialization_strategy == other.dynamic_shapes_kernel_specialization_strategy &&
		cuda_graph_strategy == other.cuda_graph_strategy && runtime_cache.get() == other.runtime_cache.get();

		.def("path", &RuntimeCacheHandle::path)
		.def("set_path", &RuntimeCacheHandle::set_path)

		std::string dynamic_shapes_kernel_specialization_strategy = "lazy";
		std::string cuda_graph_strategy = "disabled";

Conversation

tp5uiuc commented Jun 3, 2026

Summary

Tests

Status

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

tp5uiuc Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tp5uiuc Jun 4, 2026 •

edited

Loading