Notes and setup guides for Texas Robotics use of TACC Stampede3.
Texas Robotics contributed 8 nodes to Stampede3 and has priority access to them via the amd-rtx queue.
| Spec | Value |
|---|---|
| Nodes | 8 |
| CPU | 2x AMD EPYC 9555 64-Core (128 cores per node) |
| RAM | ~1.5 TB per node |
| GPUs | 8x NVIDIA RTX PRO 6000 Blackwell Server Edition per node |
| GPU Driver | 590.48.01 |
| CUDA | 13.1 (via module load nvidia) |
| OS | Rocky Linux 9.7, Kernel 5.14.0 |
Texas Robotics allocations also have access to the rest of Stampede3. See the full queue documentation for details.
| Queue | Nodes | Cores/Node | RAM/Node | GPUs | Description |
|---|---|---|---|---|---|
| skx-dev | 72 | 48 | 192 GB | -- | Dev/debug queue (2 hr limit) |
| skx | 1,160 | 48 | 192 GB | -- | Skylake CPU nodes |
| icx | 224 | 80 | 256 GB | -- | Ice Lake CPU nodes |
| spr | 616 | 112 | 128 GB HBM | -- | Sapphire Rapids HBM nodes |
| nvdimm | 3 | 80 | 4 TB | -- | Large-memory Ice Lake nodes |
| h100 | 24 | 96 | 1 TB | 4x NVIDIA H100 | GPU nodes |
| pvc | 20 | 96 | 1 TB | 4x Intel PVC | GPU nodes |
| amd-rtx | 8 | 128 | ~1.5 TB | 8x RTX PRO 6000 | TR priority |
| Component | Status | Notes |
|---|---|---|
| Isaac Lab NGC container v2.3.2 | Working (Apptainer, recommended) | Best path for latest Isaac Lab on Stampede3: run nvcr.io/nvidia/isaac-lab:2.3.2 via Apptainer; see container guide |
| IsaacLab v2.1.0 | Working (legacy path) | Older pip/micromamba install path; use container above for latest Isaac Lab version. See setup guide |
| Isaac Sim 5.1.x (source build) | Working (with caveats) | Build from source on GLIBC 2.34 nodes; see Isaac Sim source-build guide |
| PyTorch (nightly, cu128) | Working | Default PyTorch 2.5.1 does not support Blackwell; see PyTorch GPU guide |
| vLLM 0.15 | Working | LLM inference server with TP/DP support; see vLLM guide |
| Isaac Sim 4.5.0 | Working | pip install |
See the Stampede3 access documentation. You will need TACC Multi-Factor Authentication -- you will be prompted for a 2FA code at login.
```
ssh <username>@stampede3.tacc.utexas.edu
```

The `idev` command is the easiest way to get an interactive shell on a compute node:

```
idev -p amd-rtx              # default time (usually 30 min)
idev -p amd-rtx -t 2:30:00   # 2.5 hours
```

Once allocated, you can see your assigned node and SSH directly to it:

```
squeue -u $USER    # find the node name (NODELIST column)
ssh <node-name>    # e.g. ssh c571-002
```

This is useful when you need multiple terminals on the same compute node.
From the TACC docs on monitoring with sinfo, this command gives a compact overview of all queues:
```
$ sinfo -S+P -o "%18P %8a %20F"
PARTITION          AVAIL    NODES(A/I/O/T)
amd-rtx            up       1/7/0/8
h100               up       14/4/6/24
icx                up       206/8/10/224
nvdimm             up       2/1/0/3
pvc                up       6/10/4/20
skx                up       1091/5/64/1160
skx-dev*           up       10/35/27/72
spr                up       255/237/124/616
```

The NODES(A/I/O/T) column shows Allocated / Idle / Other (down, drained, etc.) / Total. In the example above, the amd-rtx queue has 1 node in use, 7 idle, and 8 total.
For per-node detail on our queue, use `sinfo -Nel -p amd-rtx`.
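The A/I/O/T field splits cleanly with standard text tools if you want the counts in a script. A minimal sketch (the helper name `idle_nodes` is our own, not a TACC utility) that pulls the idle count out of a field like `1/7/0/8`:

```shell
# Extract the "Idle" count from sinfo's NODES(A/I/O/T) field,
# whose format is Allocated/Idle/Other/Total.
idle_nodes() {
  echo "$1" | awk -F/ '{print $2}'
}

idle_nodes "1/7/0/8"   # prints 7
```

On the cluster you would feed this from live output, e.g. `sinfo -h -p amd-rtx -o "%F"`.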
```
sbatch my_job.slurm    # submit a batch job
squeue -u $USER        # your running/pending jobs
scancel <job_id>       # cancel a job
```

See the TACC job submission docs for full details.
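For reference, a minimal batch script for the amd-rtx queue might look like the following; the job name, wall time, and output filename are illustrative assumptions, not required values:

```shell
# Write an illustrative SLURM batch script for the amd-rtx queue.
cat > my_job.slurm <<'EOF'
#!/bin/bash
#SBATCH -J my_job          # job name (illustrative)
#SBATCH -p amd-rtx         # Texas Robotics priority queue
#SBATCH -N 1               # one node
#SBATCH -n 1               # one task
#SBATCH -t 02:00:00        # 2-hour wall time
#SBATCH -o my_job.%j.out   # stdout file (%j expands to the job ID)

module load nvidia         # CUDA stack
nvidia-smi                 # sanity-check GPU visibility
EOF

# Submit with: sbatch my_job.slurm
```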
Stampede3 uses Lmod for software management:
```
module list               # currently loaded modules
module spider <keyword>   # search available modules
module load nvidia        # load NVIDIA stack (CUDA, OpenMPI, etc.)
module reset              # reset to system defaults
```

See the Stampede3 file systems documentation for complete details.
| Path | Quota | Backed Up | Purge Policy |
|---|---|---|---|
| $HOME | 15 GB, 300K files | Yes | -- |
| $WORK | 1 TB, 3M files (shared across all TACC systems) | No | -- |
| $SCRATCH | No quota (~10 PB total) | No | Files not accessed in 10 days may be purged |
Check your usage:
```
/usr/local/etc/taccinfo
```

Example output:
```
--------------------- Project balances for user joydeepb ----------------------
| Name           Avail SUs     Expires |                                      |
| IRI26004           99508  2027-02-04 |                                      |
------------------------ Disk quotas for user joydeepb ------------------------
| Disk         Usage (GB)     Limit    %Used   File Usage       Limit   %Used |
| /scratch           24.0       0.0     0.00         8383           0    0.00 |
| /home1              0.2      14.0     1.75          398      500000    0.08 |
| /work             241.1    1024.0    23.54       849645     3000000   28.32 |
-------------------------------------------------------------------------------
```

Handy navigation aliases (built-in on TACC systems):
```
cdh    # cd $HOME
cdw    # cd $WORK
cds    # cd $SCRATCH
```

- `$HOME`: Small config files and dotfiles only. Do not install software here.
- `$WORK`: Persistent software installs, conda/micromamba environments, cloned repos, and datasets you need long-term. Shared across TACC systems via Stockyard.
- `$SCRATCH`: Large training outputs, checkpoints, and temporary data. Fast I/O, but files are purged after 10 days of inactivity. Do not use as long-term storage.
- Avoid many small file operations on `$HOME` and `$WORK` -- they are not designed for high-throughput I/O.
- If you need more than 1 TB of persistent storage, see TACC Corral.
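Because $SCRATCH purges files not accessed in 10 days, it can help to see what is at risk before it disappears. A sketch using standard `find` and its access-time filter (on the cluster you would point it at your real `$SCRATCH` path; the demo directory below is a stand-in):

```shell
# List files whose last access time is more than 10 days old --
# these are the ones eligible for purging on $SCRATCH.
# /tmp/demo_scratch is an illustrative stand-in for $SCRATCH.
DIR="${1:-/tmp/demo_scratch}"
find "$DIR" -type f -atime +10 2>/dev/null
```

Note that reading a file resets its access time, so `cat`-ing or re-touching data you want to keep postpones the purge, but $SCRATCH should still never be treated as long-term storage.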
We recommend micromamba for managing Python environments. It is a standalone C++ binary that resolves and installs conda packages significantly faster than conda or mamba, with no base environment overhead.
```
# Install micromamba (one-time)
"${SHELL}" <(curl -L micro.mamba.pm/install.sh)

# Create an environment
micromamba create -n myenv python=3.10 -c conda-forge -y

# Activate / deactivate
micromamba activate myenv
micromamba deactivate
```

Install environments into $WORK so they persist and are available across login and compute nodes.
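One way to keep environments in $WORK is to set micromamba's root prefix there. `MAMBA_ROOT_PREFIX` is micromamba's standard variable for where packages and environments are stored; the exact subdirectory name below is our own convention, not a requirement:

```shell
# Store micromamba packages and environments under $WORK so they persist
# and are visible from both login and compute nodes.
# The $HOME/work fallback is only for illustration off-cluster.
export MAMBA_ROOT_PREFIX="${WORK:-$HOME/work}/micromamba"
echo "$MAMBA_ROOT_PREFIX"
```

Add the `export` line to your `~/.bashrc` so every shell (including batch jobs) picks it up.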
| Guide | Description |
|---|---|
| PyTorch GPU Environment | Set up micromamba + PyTorch with GPU support on Blackwell nodes. Includes an interactive verification walkthrough and an sbatch training script. |
| vLLM LLM Serving | Install vLLM and serve LLMs with single-GPU, data-parallel (8 GPU), and tensor-parallel configurations. Includes sbatch scripts for 14B single-GPU, 14B data-parallel, and 32B tensor-parallel. |
| Isaac Lab NGC Container | Recommended for latest Isaac Lab. Run NVIDIA's nvcr.io/nvidia/isaac-lab:2.3.2 image on Stampede3 using Apptainer. Includes interactive launch helpers and an sbatch video-training example. |
| IsaacLab on Stampede3 | Legacy install guide for IsaacLab v2.1.0 + Isaac Sim 4.5.0 via pip/micromamba. Includes an sbatch script. |
| Isaac Sim Source Build (GLIBC 2.34) | Build Isaac Sim from source on Blackwell nodes where pip wheels are incompatible. Includes a full automation script (build_isaacsim_stampede.sh) and a warehouse SDG smoke test. |