Merged
4 changes: 4 additions & 0 deletions CMakeLists.txt
@@ -471,6 +471,10 @@ add_executable(gpu_multiagent_graph_neural_mppi src/gpu_multiagent_graph_neural_
target_link_libraries(gpu_multiagent_graph_neural_mppi ${OpenCV_LIBS})
target_compile_options(gpu_multiagent_graph_neural_mppi PRIVATE $<$<COMPILE_LANGUAGE:CUDA>:--expt-relaxed-constexpr --use_fast_math>)

+add_executable(gpu_priority_graph_neural_mppi src/gpu_priority_graph_neural_mppi.cu)
+target_link_libraries(gpu_priority_graph_neural_mppi ${OpenCV_LIBS})
+target_compile_options(gpu_priority_graph_neural_mppi PRIVATE $<$<COMPILE_LANGUAGE:CUDA>:--expt-relaxed-constexpr --use_fast_math>)
+
add_executable(gpu_interaction_graph_risk src/gpu_interaction_graph_risk.cu)
target_link_libraries(gpu_interaction_graph_risk ${OpenCV_LIBS})
target_compile_options(gpu_interaction_graph_risk PRIVATE $<$<COMPILE_LANGUAGE:CUDA>:--expt-relaxed-constexpr>)
Binary file added gif/gpu_priority_graph_neural_mppi.gif
11 changes: 6 additions & 5 deletions readme.md
@@ -36,14 +36,14 @@ Same algorithm on CPU and GPU — GPU enables orders of magnitude more particles
| Crowd / swarm | `gpu_crowd_swarm` | 10,000 boids with uniform-grid neighbours | 105x vs CPU |
| Graph policy control | `gpu_gnn_swarm_controller`, `gpu_gat_traversability_policy` | 2048 agents / 3072 terrain nodes x 3 heads | 2.88 ms/control; 99.4x GAT policy |
| Assignment / tracking | `gpu_hungarian_assignment`, `gpu_assignment_tracking` | 512 x 64x64 assignment / 128 tracking scenes | 158x Hungarian; 14.0x tracking |
-| Interaction graph risk | `gpu_interaction_graph_risk`, `gpu_interaction_graph_neural_mppi`, `gpu_multiagent_graph_neural_mppi` | 2048 agents x 10 message passes / 48-agent graph x 4 passes / 48 robots x 768 MPPI rollouts | 76.3x risk propagation; interaction-aware MPPI reduces social risk 19.7%; multi-agent graph MPPI cuts cross-route collisions 518 -> 261 |
-| Risk-aware planning | `gpu_reciprocal_risk_planner`, `gpu_interaction_graph_neural_mppi`, `gpu_multiagent_graph_neural_mppi` | 1024 agents x 9 actions x H=16 / 32768 social-risk MPPI rollouts / 48 robot coordinated MPPI | 4.05 ms/plan; 311.5x reciprocal risk; 4140.9x interaction-graph MPPI; 3139.6x multi-agent graph MPPI |
+| Interaction graph risk | `gpu_interaction_graph_risk`, `gpu_interaction_graph_neural_mppi`, `gpu_multiagent_graph_neural_mppi`, `gpu_priority_graph_neural_mppi` | 2048 agents x 10 message passes / 48-agent graph x 4 passes / 48 robots x 768 MPPI rollouts | 76.3x risk propagation; interaction-aware MPPI reduces social risk 19.7%; multi-agent graph MPPI cuts cross-route collisions 518 -> 261; priority arbitration cuts 261 -> 245 and clears the deadlock |
+| Risk-aware planning | `gpu_reciprocal_risk_planner`, `gpu_interaction_graph_neural_mppi`, `gpu_multiagent_graph_neural_mppi`, `gpu_priority_graph_neural_mppi` | 1024 agents x 9 actions x H=16 / 32768 social-risk MPPI rollouts / 48 robot coordinated and priority MPPI | 4.05 ms/plan; 311.5x reciprocal risk; 4140.9x interaction-graph MPPI; 3139.6x multi-agent graph MPPI; 2870.5x priority graph MPPI |
| SfM / multi-view | `gpu_sfm_mini` | 2048 features x 4 views | 217.0x match + BA vs CPU |
| Sparse linear solvers | `gpu_pcg_solver` | 262K unknowns / 1.31M CSR nnz | 13.4x Jacobi-PCG vs CPU |
| Clustering / graph ML | `gpu_em_gmm`, `gpu_spectral_clustering`, `gpu_label_propagation`, `gpu_label_propagation_traversability`, `gpu_graph_crf_traversability` | 262K GMM points / 3K graph nodes | 90.2x EM; 193x spectral; 123x propagation; 106x CRF |
| Black-box optimization | `gpu_cma_es` | 3 x 32,768 candidates x 10D | 1,254x objective eval |
| Monte Carlo planning | `gpu_mcts_planner` | 64 scenes x 4096 rollouts x 48 horizon | 712x vs CPU |
-| Learning-based planning | `gpu_diffusion_planner`, `gpu_diffusion_policy`, `gpu_diff_value_iteration_traversability`, `gpu_neural_astar_traversability`, `gpu_anytime_neural_astar_traversability`, `gpu_multigoal_neural_astar_traversability`, `gpu_spatiotemporal_neural_astar_traversability`, `gpu_experience_graph_neural_planner`, `gpu_graph_guided_neural_mppi`, `gpu_kinodynamic_graph_neural_mppi`, `gpu_multiagent_graph_neural_mppi` | 512 x 64 trajectories / 192x128 soft VI / 64x neural A* / 1536-node graph / 32768 MPPI rollouts / 48 robot graph MPPI | analytic score -> BC denoising policy; 747.4x learned-cost VI; 153.1x batched neural A*; 278.5x experience-graph A*; 1320.1x graph-guided MPPI; 3139.6x multi-agent graph MPPI |
+| Learning-based planning | `gpu_diffusion_planner`, `gpu_diffusion_policy`, `gpu_diff_value_iteration_traversability`, `gpu_neural_astar_traversability`, `gpu_anytime_neural_astar_traversability`, `gpu_multigoal_neural_astar_traversability`, `gpu_spatiotemporal_neural_astar_traversability`, `gpu_experience_graph_neural_planner`, `gpu_graph_guided_neural_mppi`, `gpu_kinodynamic_graph_neural_mppi`, `gpu_multiagent_graph_neural_mppi`, `gpu_priority_graph_neural_mppi` | 512 x 64 trajectories / 192x128 soft VI / 64x neural A* / 1536-node graph / 32768 MPPI rollouts / 48 robot graph MPPI | analytic score -> BC denoising policy; 747.4x learned-cost VI; 153.1x batched neural A*; 278.5x experience-graph A*; 1320.1x graph-guided MPPI; 2870.5x priority graph MPPI |
| Voxel map (3D) | `comparison_voxel_map` | 256x256x32 | 58x per ray |
| ESDF (2D/3D) | `comparison_esdf`, `comparison_esdf_3d` | 640K cells / 1.05M voxels | 53,404x / 86,613x |
| LiDAR sim | `comparison_lidar_sim`, `comparison_lidar3d_sim`, `comparison_lidar3d_realistic` | 1M 2D / 131K 3D rays | + 5 physical effects (realistic) |
@@ -88,8 +88,8 @@ Same algorithm on CPU and GPU — GPU enables orders of magnitude more particles
| <img src="https://rsasaki0109.github.io/CudaRobotics/gpu_graph_guided_neural_mppi.gif" width="400"/> | <img src="https://rsasaki0109.github.io/CudaRobotics/gpu_kinodynamic_graph_neural_mppi.gif" width="400"/> |
| **GPU interaction-graph neural MPPI (48-agent message-passing social risk + 32768 rollouts, social risk 1.628 -> 1.308, 4140.9x vs CPU)** | **GPU multi-agent graph-neural MPPI (48 robots × 768 rollouts × H=72, cross-route collisions 518 -> 261, social risk 3.544 -> 2.588, 3139.6x vs CPU)** |
| <img src="https://rsasaki0109.github.io/CudaRobotics/gpu_interaction_graph_neural_mppi.gif" width="400"/> | <img src="https://rsasaki0109.github.io/CudaRobotics/gpu_multiagent_graph_neural_mppi.gif" width="400"/> |
-| **GPU Multi-Robot Planner (200 robots, parallel BF distance fields)** | **Massive Collision Check (1M segments, 1,277x)** |
-| <img src="https://rsasaki0109.github.io/CudaRobotics/gpu_multi_robot_planner.gif" width="400"/> | <img src="https://rsasaki0109.github.io/CudaRobotics/comparison_collision_check.gif" width="400"/> |
+| **GPU priority graph-neural MPPI (right-of-way arbitration, collisions 261 -> 245, reach 36/48 -> 40/48, 2870.5x vs CPU)** | **GPU Multi-Robot Planner (200 robots, parallel BF distance fields)** |
+| <img src="https://rsasaki0109.github.io/CudaRobotics/gpu_priority_graph_neural_mppi.gif" width="400"/> | <img src="https://rsasaki0109.github.io/CudaRobotics/gpu_multi_robot_planner.gif" width="400"/> |
| **Massive RRT* Rewire (CPU 2K vs CUDA 200K nodes)** | **3D ESDF (32³ CPU vs 128²×64 CUDA, 86,613x)** |
| <img src="https://rsasaki0109.github.io/CudaRobotics/comparison_rrtstar_rewire.gif" width="400"/> | <img src="https://rsasaki0109.github.io/CudaRobotics/comparison_esdf_3d.gif" width="400"/> |
| **GPU diffusion policy (768-sample BC MLP prior + diffusion refinement, 512×64 paths)** | **GPU diffusion planner (512 trajectories × 64 waypoints, 120 Langevin steps, 0.03 ms/step)** |
@@ -209,6 +209,7 @@ cd ros2_ws && colcon build --packages-select cuda_robotics
| GPU kinodynamic graph-neural MPPI | 32768 nonholonomic speed/steering rollouts x H=72 x guided/unguided batches, cost 1516.74 -> 851.11, terminal error 5.11 -> 0.88, route error 1.530 -> 0.252, **49.9x** vs CPU equivalent kinodynamic rollout evaluation |
| GPU interaction-graph neural MPPI | 48 moving agents x 4 message-passing risk updates + 32768 MPPI rollouts x H=72, social risk 1.628 -> 1.308, clearance -0.15 -> -0.10, full objective 2913.50 -> 2395.14, **4140.9x** vs CPU equivalent rollout evaluation |
| GPU multi-agent graph-neural MPPI | 48 robots x 768 rollouts x H=72 x independent/coordinated modes, cross-route collisions 518 -> 261, social risk 3.544 -> 2.588, reach basin 48/48 -> 36/48, **3139.6x** vs CPU equivalent rollout evaluation |
+| GPU priority graph-neural MPPI | 48 robots x 768 rollouts x H=72 x coordinated/priority modes, right-of-way arbitration cuts cross-route collisions 261 -> 245, reach basin 36/48 -> 40/48, deadlocks 1 -> 0, terminal error 1.97 -> 1.65, **2870.5x** vs CPU equivalent rollout evaluation |
| GPU GNN swarm controller | 2048 agents x 3 message passes, 2.88 ms/control, **44.3x** vs CPU |
| GPU reciprocal risk planner | 1024 agents x 9 actions x H=16, 4.05 ms/plan, **311.5x** vs CPU |
| GPU assignment tracking | 128 scenes x 48 tracks x 72 detections, **14.0x** vs CPU |