perf: Disable USD notice handlers during usd_replicate for 25-35% faster startup#5070
perf: Disable USD notice handlers during usd_replicate for 25-35% faster startup#5070aloysbaillet wants to merge 1 commit intoisaac-sim:developfrom
Conversation
Greptile SummaryThis PR delivers a meaningful startup-time optimization (25–35% reduction for large environments) by mirroring what IsaacSim's own Key changes:
Issues found:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller
participant usd_replicate
participant _disable_usd_notice_handlers
participant SimulationManager
participant UsdUtils_StageCache
participant Sdf
Caller->>usd_replicate: usd_replicate(stage, sources, destinations, ...)
usd_replicate->>_disable_usd_notice_handlers: _disable_usd_notice_handlers(stage)
_disable_usd_notice_handlers->>UsdUtils_StageCache: Insert(stage) → stage_id
_disable_usd_notice_handlers->>SimulationManager: is_fabric_usd_notice_handler_enabled(stage_id)
SimulationManager-->>_disable_usd_notice_handlers: fabric_was_enabled
_disable_usd_notice_handlers->>SimulationManager: enable_fabric_usd_notice_handler(stage_id, False)
_disable_usd_notice_handlers->>SimulationManager: enable_usd_notice_handler(False)
_disable_usd_notice_handlers-->>usd_replicate: state dict
rect rgb(200, 230, 200)
Note over usd_replicate,Sdf: try block — bulk USD authoring with handlers suppressed
usd_replicate->>Sdf: ChangeBlock (single block wrapping all depths)
loop for each depth × source × env_id
usd_replicate->>Sdf: CreatePrimInLayer(rl, dp)
usd_replicate->>Sdf: CopySpec(src → dp)
usd_replicate->>Sdf: author xformOp:translate / xformOp:orient
end
Sdf-->>usd_replicate: ChangeBlock closes → single batch notify
end
rect rgb(255, 220, 220)
Note over usd_replicate,SimulationManager: finally block — always restore handlers
usd_replicate->>SimulationManager: enable_fabric_usd_notice_handler(stage_id, True) [if was_enabled]
usd_replicate->>SimulationManager: enable_usd_notice_handler(True) [unconditional ⚠️]
end
usd_replicate-->>Caller: return
Last reviewed commit: "perf: disable USD no..." |
| state["fabric_notice_was_enabled"] = SimulationManager.is_fabric_usd_notice_handler_enabled(stage_id) | ||
| if state["fabric_notice_was_enabled"]: | ||
| SimulationManager.enable_fabric_usd_notice_handler(stage_id, False) | ||
| SimulationManager.enable_usd_notice_handler(False) |
There was a problem hiding this comment.
USD notice handler state not saved before disabling
SimulationManager.enable_usd_notice_handler(False) is called unconditionally without first checking whether it was already enabled. In _enable_usd_notice_handlers, it is then always restored to True — regardless of what its state was before usd_replicate() was called.
Contrast this with the fabric notice handler, where the initial state is checked with is_fabric_usd_notice_handler_enabled() and only restored if it was enabled. If any outer code (e.g. another nested cloning operation or a user-managed disable context) had already set the USD notice handler to False, calling usd_replicate() would silently re-enable it in the finally block — a side effect the caller would not expect.
The fix is to save and conditionally restore the USD notice handler state, mirroring the fabric handler pattern:
try:
from isaacsim.core.simulation_manager.impl.simulation_manager import SimulationManager
state["fabric_notice_was_enabled"] = SimulationManager.is_fabric_usd_notice_handler_enabled(stage_id)
if state["fabric_notice_was_enabled"]:
SimulationManager.enable_fabric_usd_notice_handler(stage_id, False)
# Save and conditionally disable the USD notice handler as well
state["usd_notice_was_enabled"] = SimulationManager.is_usd_notice_handler_enabled()
if state["usd_notice_was_enabled"]:
SimulationManager.enable_usd_notice_handler(False)
state["sim_manager_available"] = TrueAnd in _enable_usd_notice_handlers:
if state.get("usd_notice_was_enabled", False):
SimulationManager.enable_usd_notice_handler(True)(This assumes SimulationManager.is_usd_notice_handler_enabled() exists; if not, at minimum the existing asymmetry relative to the fabric handler should be documented.)
| stage_id = UsdUtils.StageCache.Get().Insert(stage).ToLongInt() | ||
| state["stage_id"] = stage_id |
There was a problem hiding this comment.
Stage inserted into global cache without cleanup
UsdUtils.StageCache.Get().Insert(stage) inserts the stage into the global, process-wide USD stage cache. If the stage was not already registered in the cache, this call permanently adds it — and there is no corresponding removal, either in _disable_usd_notice_handlers or in the finally block of usd_replicate().
In Isaac Sim's normal runtime the active stage is typically already cached (Isaac Sim registers it at startup), so this is usually a no-op insertion. However, in unit tests or lightweight scenarios where the stage is created independently, calling usd_replicate() would silently inject it into the global cache and it would remain there for the lifetime of the process, preventing the stage from being garbage-collected.
Consider checking whether the stage was already in the cache and only storing the ID without re-insertion, or — if an insertion was needed — removing it in the finally path:
cache = UsdUtils.StageCache.Get()
cache_id = cache.GetId(stage)
if not cache_id.IsValid():
cache_id = cache.Insert(stage)
state["stage_inserted_by_us"] = True
else:
state["stage_inserted_by_us"] = False
state["stage_id"] = cache_id.ToLongInt()
state["cache_id_obj"] = cache_idThen in _enable_usd_notice_handlers, conditionally erase:
if state.get("stage_inserted_by_us", False):
UsdUtils.StageCache.Get().Erase(state["cache_id_obj"])| SimulationManager.enable_fabric_usd_notice_handler(stage_id, False) | ||
| SimulationManager.enable_usd_notice_handler(False) | ||
| state["sim_manager_available"] = True | ||
| except (ImportError, Exception) as e: |
There was a problem hiding this comment.
Redundant exception type in
except clause
ImportError is a subclass of Exception, so except (ImportError, Exception) is equivalent to except Exception. Listing ImportError explicitly adds no additional behavior and may give the false impression that it is handled differently from other exceptions.
| except (ImportError, Exception) as e: | |
| except Exception as e: |
| if state.get("fabric_notice_was_enabled", False): | ||
| SimulationManager.enable_fabric_usd_notice_handler(state["stage_id"], True) | ||
| SimulationManager.enable_usd_notice_handler(True) | ||
| except (ImportError, Exception) as e: |
| with Sdf.ChangeBlock(): | ||
| for i in depth_to_indices[depth]: | ||
| src = sources[i] | ||
| tmpl = destinations[i] | ||
| # Select target environments for this source (supports None, [E], or [S, E]) | ||
| target_envs = env_ids if mask is None else env_ids[mask[i]] | ||
| for wid in target_envs.tolist(): | ||
| dp = tmpl.format(wid) | ||
| Sdf.CreatePrimInLayer(rl, dp) | ||
| if src == dp: | ||
| pass # self-copy: CreatePrimInLayer already ensures it exists; CopySpec would be destructive | ||
| else: | ||
| Sdf.CopySpec(rl, Sdf.Path(src), rl, Sdf.Path(dp)) | ||
|
|
||
| if positions is not None or quaternions is not None: | ||
| ps = rl.GetPrimAtPath(dp) | ||
| op_names = [] | ||
| if positions is not None: | ||
| p = positions[wid] | ||
| t_attr = ps.GetAttributeAtPath(dp + ".xformOp:translate") | ||
| if t_attr is None: | ||
| t_attr = Sdf.AttributeSpec(ps, "xformOp:translate", Sdf.ValueTypeNames.Double3) | ||
| t_attr.default = Gf.Vec3d(float(p[0]), float(p[1]), float(p[2])) | ||
| op_names.append("xformOp:translate") | ||
| if quaternions is not None: | ||
| q = quaternions[wid] | ||
| o_attr = ps.GetAttributeAtPath(dp + ".xformOp:orient") | ||
| if o_attr is None: | ||
| o_attr = Sdf.AttributeSpec(ps, "xformOp:orient", Sdf.ValueTypeNames.Quatd) | ||
| # xyzw convention: q[3] is w, q[0:3] is xyz | ||
| o_attr.default = Gf.Quatd(float(q[3]), Gf.Vec3d(float(q[0]), float(q[1]), float(q[2]))) | ||
| op_names.append("xformOp:orient") | ||
| # Only author xformOpOrder for the ops we actually authored | ||
| if op_names: | ||
| op_order = ps.GetAttributeAtPath(dp + ".xformOpOrder") or Sdf.AttributeSpec( | ||
| ps, UsdGeom.Tokens.xformOpOrder, Sdf.ValueTypeNames.TokenArray | ||
| ) | ||
| op_order.default = Vt.TokenArray(op_names) | ||
| for depth in sorted(depth_to_indices.keys()): |
There was a problem hiding this comment.
Single
Sdf.ChangeBlock changes prior per-depth composition semantics
The original code wrapped each depth level in its own Sdf.ChangeBlock(), which caused USD to flush pending change notices and re-compose the stage at the end of each depth. The accompanying comment explicitly noted this was done "to stabilize composition" for nested or interdependent specs.
The new code wraps all depths in one Sdf.ChangeBlock, deferring all notifications (and USD composition) until every Sdf.CopySpec across every depth has completed. For pure SDF layer writes this should be safe because Sdf.CopySpec operates on spec data rather than the composed stage. However, if any path deeper in the hierarchy is defined via composition arcs that reference an ancestor spec authored at a shallower depth in the same call, the composed view of those arcs won't be available until after the outer ChangeBlock closes.
This is likely fine for the tested robot/environment assets, but it is a semantic regression from the original design. If this change is intentional (as an additional performance improvement complementing the notice-handler suppression), please add a comment explaining why per-depth flushing is no longer needed, so future maintainers don't revert it without understanding the trade-off.
| @functools.cache | ||
| def string_to_callable(name: str) -> Callable: |
There was a problem hiding this comment.
@functools.cache is unbounded and never cleared
functools.cache is an alias for lru_cache(maxsize=None), which means every unique string ever passed to string_to_callable is retained for the process lifetime along with a strong reference to the resolved callable (and, transitively, its module). In long-running Isaac Sim training sessions with many distinct callable strings this cache will grow without bound and prevent modules from being garbage-collected.
Two additional edge-cases to be aware of:
- Module reloading: if a downstream user calls
importlib.reload(some_module), cached entries will silently continue returning the pre-reload callable. - Lambda expressions:
eval(name)creates a new function object on each call, but with caching only one instance is ever created. This is harmless for pure lambdas, but could be surprising if someone expected independent closure objects.
Consider using @functools.lru_cache(maxsize=512) to cap memory usage, or document the unbounded nature clearly so maintainers know to call string_to_callable.cache_clear() if needed.
7950929 to
3818cf8
Compare
Disable IFabricUsd and SimulationManager notice handlers during usd_replicate to avoid per-Sdf.CopySpec Fabric sync overhead. Critically, do NOT re-enable the handlers after cloning — the downstream SimulationContext.reset() handles the Fabric sync naturally, and re-enabling triggers an expensive batch resync. Also caches string_to_callable() with @functools.cache and fixes redundant triple ast.parse() in is_lambda_expression(). Key insight: The batch Fabric resync on re-enable is MORE expensive than letting SimulationContext.reset handle sync as part of its normal flow. Measured improvements (4096/8192 envs, 1 GPU, warm run): Isaac-Cartpole-Direct-v0: 8.9s -> 7.4s (-17%) Isaac-Reach-Franka-v0: 29.0s -> 9.6s (-67%) Isaac-Velocity-Rough-H1-v0: 37.9s -> 25.3s (-33%) Isaac-Ant-Direct-v0: 11.9s -> 8.4s (-29%) Isaac-Velocity-Flat-Anymal-D: 29.8s -> 16.2s (-46%) Isaac-Repose-Cube-Shadow-Direct: 46.6s -> 27.0s (-42%) Average improvement: -39% No FPS regressions. All tasks pass. 18/18 cloner tests pass.
3818cf8 to
9e9d7e4
Compare
Description
Summary
Disable IFabricUsd and SimulationManager USD notice handlers during bulk environment cloning in usd_replicate(), reducing startup time by 25-35% for manager-based tasks.
Problem
When usd_replicate() calls Sdf.CopySpec to clone thousands of environment prims, every single copy triggers IFabricUsd UsdNoticeListener::Handle(UsdNotice::ObjectsChanged) — an expensive Fabric↔USD sync operation. For Reach-Franka (4096 envs), this added 17 seconds (438 notice handler calls) during scene construction.
IsaacSim's own Cloner.disable_change_listener() already handles this by disabling the notice handler before cloning, but IsaacLab v3's replacement usd_replicate() function didn't replicate this behavior.
Additionally, the import path from isaacsim.core.simulation_manager import SimulationManager resolves to PhysxManager under IsaacLab's lazy_export module system, which lacks the enable_fabric_usd_notice_handler method. The correct import is from isaacsim.core.simulation_manager.impl.simulation_manager import SimulationManager.
Changes
source/isaaclab/isaaclab/cloner/cloner_utils.py:
• Add _disable_usd_notice_handlers() / _enable_usd_notice_handlers() helpers that mirror isaacsim.core.cloner.Cloner.disable_change_listener()
• Wrap usd_replicate() body in try/finally to disable handlers during bulk USD authoring
• Use correct full import path for SimulationManager
source/isaaclab/isaaclab/utils/string.py:
• Cache string_to_callable() with @functools.cache to avoid repeated importlib.import_module calls
• Fix triple ast.parse() call in is_lambda_expression()
Benchmark Results
Tested on DGXC (2x NVIDIA L40, 4096 envs, 1 GPU, warm run):
Root Cause Analysis
Profiling with Nsight Systems showed IFabricUsd UsdNoticeListener::Handle consuming:
• 17.0s for Reach-Franka (438 calls)
• 9.9s for H1-Rough
• 22.9s for Dexsuite-Kuka
These notice handler calls happen during usd_replicate when Sdf.CopySpec modifies the USD stage. By disabling the handlers (as IsaacSim's own Cloner does), the Fabric sync is deferred to a single batch at the end.
Type of change
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there