-
Notifications
You must be signed in to change notification settings - Fork 3.3k
[WIP] Add warp environment docs and timer alignment #4995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
hujc7
wants to merge
7
commits into
isaac-sim:develop
Choose a base branch
from
hujc7:jichuanh/warp-env-docs
base: develop
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+16,824
−159
Open
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
98d5fc3
Add warp manager-based env infrastructure
hujc7 5108ca8
Add warp Cartpole task configuration
hujc7 b281975
Add warp MDP terms and manager infrastructure updates
hujc7 dc43249
Add capture safety guards and fix WrenchComposer stale COM pose
hujc7 2e3f6ad
Add warp env configs and task-specific MDP terms
hujc7 a9ece43
Add warp MDP parity tests
hujc7 7ee1f43
Add warp environment docs and align step timer across all env variants
hujc7 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
283 changes: 283 additions & 0 deletions
283
.../source/experimental-features/newton-physics-integration/warp-env-migration.rst
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,283 @@ | ||
| .. _warp-env-migration: | ||
|
|
||
| Warp Environment Guide | ||
| ====================== | ||
|
|
||
| This guide covers the key conventions and patterns used by the warp-first environment | ||
| infrastructure, useful for migrating existing stable environments or creating new ones natively. | ||
|
|
||
| .. note:: | ||
|
|
||
| The warp environment infrastructure lives in ``isaaclab_experimental`` and | ||
| ``isaaclab_tasks_experimental``. It's an experimental feature. | ||
|
|
||
|
|
||
| Design Rationale | ||
| ~~~~~~~~~~~~~~~~ | ||
|
|
||
| The warp environment path is built around `CUDA graph capture | ||
| <https://docs.nvidia.com/cuda/cuda-programming-guide/04-special-topics/cuda-graphs.html>`_. | ||
| A CUDA graph records a sequence of GPU operations (kernel launches, memory copies) during a | ||
| capture phase, then replays the entire sequence with a single launch. This eliminates per-kernel | ||
| CPU overhead — the parameter validation, kernel selection, and buffer setup that normally costs | ||
| 20–200 μs per operation is performed once during graph instantiation and reused on every replay | ||
| (~10 μs total). All CPU-side code (Python logic, torch dispatching) executed during capture is | ||
| completely bypassed during replay. See the `Warp concurrency documentation | ||
| <https://nvidia.github.io/warp/deep_dive/concurrency.html>`_ for Warp's graph capture API | ||
| (``wp.ScopedCapture``). | ||
|
|
||
| All design decisions in the warp infrastructure follow from this constraint: every operation in the | ||
| step loop must be a GPU kernel launch with stable memory pointers so that the captured graph can | ||
| be replayed without modification. | ||
|
|
||
| Key consequences: | ||
|
|
||
| - All buffers are **pre-allocated** — no dynamic allocation inside the step loop | ||
| - Data flows through **persistent ``wp.array`` pointers** — never replaced, only overwritten | ||
| - MDP terms are **pure ``@wp.kernel`` functions** — no Python branching on GPU data | ||
| - Reset uses **boolean masks** (``env_mask``) instead of index lists (``env_ids``) to avoid | ||
| variable-length indexing that changes graph topology | ||
|
|
||
|
|
||
| Project Structure | ||
| ~~~~~~~~~~~~~~~~~ | ||
|
|
||
| Warp-specific implementations that deviate from stable live in the ``_experimental`` packages: | ||
|
|
||
| - ``isaaclab_experimental`` — warp managers, base env classes, warp MDP terms | ||
| - ``isaaclab_tasks_experimental`` — warp task configs and task-specific MDP terms | ||
|
|
||
| Any new warp implementation that differs from the stable API belongs in these packages. | ||
| Warp task configs reference Newton physics directly (no ``PresetCfg``) since the warp path | ||
| is Newton-only. | ||
|
|
||
|
|
||
| Writing Warp MDP Terms | ||
| ~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| Imports | ||
| ^^^^^^^ | ||
|
|
||
| Warp task configs import from the experimental packages: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| # Warp | ||
| from isaaclab_experimental.managers import ObservationTermCfg, RewardTermCfg, SceneEntityCfg | ||
| import isaaclab_experimental.envs.mdp as mdp | ||
|
|
||
| The term config classes have the same interface — only the import path changes. | ||
|
|
||
|
|
||
| Common Pattern | ||
| ^^^^^^^^^^^^^^ | ||
|
|
||
| All warp MDP terms (observations, rewards, terminations, events, actions) follow the same | ||
| **kernel + launch** pattern. Stable terms use torch tensors and return results; warp terms | ||
| write into pre-allocated ``wp.array`` output buffers via ``@wp.kernel`` functions: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| # Stable — returns a tensor | ||
| def lin_vel_z_l2(env, asset_cfg) -> torch.Tensor: | ||
| return torch.square(asset.data.root_lin_vel_b[:, 2]) | ||
|
|
||
| # Warp — writes into pre-allocated output | ||
| @wp.kernel | ||
| def _lin_vel_z_l2_kernel(vel: wp.array(...), out: wp.array(dtype=wp.float32)): | ||
| i = wp.tid() | ||
| out[i] = vel[i][2] * vel[i][2] | ||
|
|
||
| def lin_vel_z_l2(env, out, asset_cfg) -> None: | ||
| wp.launch(_lin_vel_z_l2_kernel, dim=env.num_envs, inputs=[..., out]) | ||
|
|
||
| The output buffer shapes differ by term type: | ||
|
|
||
| - **Observations**: ``(num_envs, D)`` where D is the observation dimension | ||
| - **Rewards**: ``(num_envs,)`` | ||
| - **Terminations**: ``(num_envs,)`` with dtype ``bool`` | ||
| - **Events**: ``(num_envs,)`` mask — events don't produce output, they modify sim state | ||
|
|
||
|
|
||
| Observation Terms | ||
| ^^^^^^^^^^^^^^^^^ | ||
|
|
||
| Since warp terms write into pre-allocated buffers, the observation manager must know each | ||
| term's output dimension at initialization to allocate the correct ``(num_envs, D)`` output | ||
| array. This is resolved via a fallback chain (see | ||
| ``ObservationManager._infer_term_dim_scalar`` in | ||
| ``isaaclab_experimental/managers/observation_manager.py``): | ||
|
|
||
| 1. **Explicit ``out_dim`` in decorator** (preferred): | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| @generic_io_descriptor_warp(out_dim=3, observation_type="RootState") | ||
| def base_lin_vel(env, out, asset_cfg) -> None: ... | ||
|
|
||
| ``out_dim`` can be an integer, or a string that resolves at initialization: | ||
|
|
||
| - ``"joint"`` — number of selected joints from ``asset_cfg`` | ||
| - ``"body:N"`` — N components per selected body from ``asset_cfg`` | ||
| - ``"command"`` — dimension from command manager | ||
| - ``"action"`` — dimension from action manager | ||
|
|
||
| 2. **``axes`` metadata**: Dimension equals the number of axes listed: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| @generic_io_descriptor_warp(axes=["X", "Y", "Z"], observation_type="RootState") | ||
| def projected_gravity(env, out, asset_cfg) -> None: ... | ||
| # → dimension = 3 | ||
|
|
||
| 3. **Legacy params**: ``term_dim``, ``out_dim``, or ``obs_dim`` keys in ``term_cfg.params``. | ||
|
|
||
| 4. **Asset config fallback**: Count of ``asset_cfg.joint_ids`` (or ``joint_ids_wp``) for | ||
| joint-level terms. | ||
|
|
||
|
|
||
| Event Terms | ||
| ^^^^^^^^^^^ | ||
|
|
||
| Events use ``env_mask`` (boolean ``wp.array``) instead of ``env_ids``, and each kernel | ||
| checks the mask to skip non-selected environments: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| def reset_joints_by_offset(env, env_mask, ...): | ||
| wp.launch(_kernel, dim=env.num_envs, inputs=[env_mask, ...]) | ||
|
|
||
| @wp.kernel | ||
| def _kernel(env_mask: wp.array(dtype=wp.bool), ...): | ||
| i = wp.tid() | ||
| if not env_mask[i]: | ||
| return | ||
| # ... modify state for selected envs only | ||
|
|
||
| - RNG uses per-env ``env.rng_state_wp`` (``wp.uint32``) instead of ``torch.rand`` | ||
| - **Startup/prestartup** events use the stable convention ``(env, env_ids, **params)`` | ||
| - **Reset/interval** events use the warp convention ``(env, env_mask, **params)`` | ||
|
|
||
|
|
||
| Action Terms | ||
| ^^^^^^^^^^^^ | ||
|
|
||
| Actions follow a **two-stage execution**: ``process_actions`` (called once per env step) scales | ||
| and clips raw actions, and ``apply_actions`` (called once per sim step) writes targets to the | ||
| asset. Both stages use warp kernels with pre-allocated ``_raw_actions`` and ``_processed_actions`` | ||
| buffers. | ||
|
|
||
|
|
||
| Capture Safety | ||
| ^^^^^^^^^^^^^^ | ||
|
|
||
| When writing terms that run inside the captured step loop, keep in mind: | ||
|
|
||
| - **No ``wp.to_torch``** or torch arithmetic — stay in warp throughout | ||
| - **No lazy-evaluated properties** — use sim-bound (Tier 1) data directly; if a derived | ||
| quantity is needed, compute it inline in the kernel | ||
| - **No dynamic allocation** — all buffers must be pre-allocated in ``__init__`` | ||
|
|
||
|
|
||
| Parity Testing | ||
| ~~~~~~~~~~~~~~ | ||
|
|
||
| Two levels of parity testing are used to validate warp terms: | ||
|
|
||
| **1. Implementation parity (stable vs warp)** — verifies that the warp kernel produces the | ||
| same result as the stable torch implementation. This is optional for terms that have no stable | ||
| counterpart (e.g. new terms written directly in warp). | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| import isaaclab.envs.mdp.observations as stable_obs | ||
| import isaaclab_experimental.envs.mdp.observations as warp_obs | ||
|
|
||
| # Stable baseline | ||
| expected = stable_obs.joint_pos(stable_env, asset_cfg=cfg) | ||
|
|
||
| # Warp (uncaptured) | ||
| out = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device) | ||
| warp_obs.joint_pos(warp_env, out, asset_cfg=cfg) | ||
| actual = wp.to_torch(out) | ||
|
|
||
| torch.testing.assert_close(actual, expected) | ||
|
|
||
| **2. Capture parity (warp vs warp-captured)** — verifies that the term produces identical | ||
| results when replayed from a CUDA graph vs launched directly. A mismatch here indicates capture-unsafe | ||
| code (e.g. stale pointers, dynamic allocation, or lazy property access that doesn't replay). | ||
| This test should always be run, even for terms without a stable counterpart. | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| # Warp uncaptured | ||
| out_uncaptured = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device) | ||
| warp_obs.joint_pos(warp_env, out_uncaptured, asset_cfg=cfg) | ||
|
|
||
| # Warp captured (graph replay) | ||
| out_captured = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device) | ||
| with wp.ScopedCapture() as cap: | ||
| warp_obs.joint_pos(warp_env, out_captured, asset_cfg=cfg) | ||
| wp.capture_launch(cap.graph) | ||
|
|
||
| torch.testing.assert_close(wp.to_torch(out_captured), wp.to_torch(out_uncaptured)) | ||
|
|
||
| See ``source/isaaclab_experimental/test/envs/mdp/`` for complete parity test examples. | ||
|
|
||
|
|
||
| Available Warp MDP Terms | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| .. list-table:: | ||
| :header-rows: 1 | ||
| :widths: 20 80 | ||
|
|
||
| * - Category | ||
| - Available Terms | ||
| * - Observations (11) | ||
| - | ``base_pos_z`` | ||
| | ``base_lin_vel`` | ||
| | ``base_ang_vel`` | ||
| | ``projected_gravity`` | ||
| | ``joint_pos`` | ||
| | ``joint_pos_rel`` | ||
| | ``joint_pos_limit_normalized`` | ||
| | ``joint_vel`` | ||
| | ``joint_vel_rel`` | ||
| | ``last_action`` | ||
| | ``generated_commands`` | ||
| * - Rewards (16) | ||
| - | ``is_alive`` | ||
| | ``is_terminated`` | ||
| | ``lin_vel_z_l2`` | ||
| | ``ang_vel_xy_l2`` | ||
| | ``flat_orientation_l2`` | ||
| | ``joint_torques_l2`` | ||
| | ``joint_vel_l1`` | ||
| | ``joint_vel_l2`` | ||
| | ``joint_acc_l2`` | ||
| | ``joint_deviation_l1`` | ||
| | ``joint_pos_limits`` | ||
| | ``action_rate_l2`` | ||
| | ``action_l2`` | ||
| | ``undesired_contacts`` | ||
| | ``track_lin_vel_xy_exp`` | ||
| | ``track_ang_vel_z_exp`` | ||
| * - Events (6) | ||
| - | ``reset_joints_by_offset`` | ||
| | ``reset_joints_by_scale`` | ||
| | ``reset_root_state_uniform`` | ||
| | ``push_by_setting_velocity`` | ||
| | ``apply_external_force_torque`` | ||
| | ``randomize_rigid_body_com`` | ||
| * - Terminations (4) | ||
| - | ``time_out`` | ||
| | ``root_height_below_minimum`` | ||
| | ``joint_pos_out_of_manual_limit`` | ||
| | ``illegal_contact`` | ||
| * - Actions (2) | ||
| - | ``JointPositionAction`` | ||
| | ``JointEffortAction`` | ||
|
|
||
| Terms not listed here remain in stable only. When using an env that requires unlisted terms, | ||
| those terms must be implemented in warp first. |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing RST files referenced in toctree
The toctree in this file references
training-environmentsandvisualization, but neithertraining-environments.rstnorvisualization.rstexists in thenewton-physics-integration/directory (confirmed viagit ls-treeat the PR's head SHA).The directory contains only:
installation.rstwarp-environments.rst✅ (added by this PR)warp-env-migration.rst✅ (added by this PR)limitations-and-known-bugs.rstsolver-transitioning.rstsim-to-sim.rstsim-to-real.rstSphinx will fail with a
WARNING: document not found: training-environments(andvisualization) error during the documentation build, which typically promotes to a build error in CI. Either create the missing stub files or remove these entries from the toctree until the pages are ready.