Add GPU MegaParticles GICP distribution-to-distribution likelihood by rsasaki0109 · Pull Request #115 · rsasaki0109/CudaRobotics

rsasaki0109 · 2026-05-26T02:59:07Z

Summary

Adds a GICP-style distribution-to-distribution (D2D) scan likelihood to the
MegaParticles localization line — the "GICP-like point-cloud likelihood"
follow-up to the Stein (#86), explicit-LSH (#101) and 6-DoF SE(3) (#104) demos.
The earlier demos score a range scan against a precomputed distance field:
every endpoint is penalised by its isotropic distance to the nearest wall. That
is cheap but blurs surface structure — a point sliding along a wall is
penalised as hard as a point moving into it.

Here the map is instead a point cloud with per-point surface-aware
covariances (the GICP "disk": small variance along the surface normal, large
along the tangent). Each particle transforms its scan into the world, matches
every endpoint to the nearest map point through a uniform grid index, and scores
the Mahalanobis residual under the combined covariance
M = (C_map + R C_scan Rᵀ)⁻¹ (Segal et al., Generalized-ICP, RSS 2009),
summed over the scan. This is point-to-line / distribution-to-distribution: the
cost barely grows for tangential slip but rises sharply for normal-direction
error — the correct probabilistic weight near walls.
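
The anisotropy described above can be illustrated with a minimal 2D sketch. It is not taken from the PR's kernel: `Mat2`, `disk_cov`, `rotate_cov`, `d2d_cost` and the variance values are hypothetical names and numbers chosen for illustration. It builds a disk covariance from a surface normal and evaluates the squared Mahalanobis residual d^T (C_map + R C_scan R^T)^{-1} d.

```cpp
#include <cmath>

// Row-major 2x2 matrix [[a, b], [c, d]] (hypothetical helper type).
struct Mat2 { double a, b, c, d; };

// Disk covariance: variance var_n along the unit normal (nx, ny) and
// var_t along the tangent (-ny, nx). Assumes (nx, ny) has unit length.
Mat2 disk_cov(double nx, double ny, double var_n, double var_t) {
    double tx = -ny, ty = nx;
    double xx = var_n * nx * nx + var_t * tx * tx;
    double xy = var_n * nx * ny + var_t * tx * ty;
    double yy = var_n * ny * ny + var_t * ty * ty;
    return {xx, xy, xy, yy};
}

// R C R^T for a rotation by theta (the particle heading).
Mat2 rotate_cov(const Mat2& C, double theta) {
    double c = std::cos(theta), s = std::sin(theta);
    // RC with R = [[c, -s], [s, c]]
    double r00 = c * C.a - s * C.c, r01 = c * C.b - s * C.d;
    double r10 = s * C.a + c * C.c, r11 = s * C.b + c * C.d;
    // (RC) R^T with R^T = [[c, s], [-s, c]]
    return {r00 * c - r01 * s, r00 * s + r01 * c,
            r10 * c - r11 * s, r10 * s + r11 * c};
}

// Squared Mahalanobis residual of d = (dx, dy) under the combined
// covariance C_map + R C_scan R^T, using the closed-form 2x2 inverse.
double d2d_cost(double dx, double dy, const Mat2& C_map,
                const Mat2& C_scan, double theta) {
    Mat2 Cs = rotate_cov(C_scan, theta);
    double a = C_map.a + Cs.a, b = C_map.b + Cs.b;
    double c = C_map.c + Cs.c, d = C_map.d + Cs.d;
    double det = a * d - b * c;
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det;
}
```

For a wall along the x axis (normal (0, 1)) with the illustrative variances 0.01 along the normal and 1.0 along the tangent, and an isotropic scan covariance of 0.01, a 0.1 m offset along the normal costs about 50 times more than the same offset along the wall.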

To isolate the likelihood this is a controlled head-to-head: two filters,
each 1,048,576 particles, sharing the identical MegaParticles machinery
(global uniform init, Gauss-Newton particle motion, sparse bucket-neighbor Stein
attraction/repulsion, posterior smoothing, representative-state gate, hidden
kidnap + 15-frame scan blackout recovery). Only the per-particle scoring kernel
differs:

Arm A (field proxy, control): the distance-field endpoint likelihood from the
MegaParticles-style Stein MCL demo (#86).
Arm B (GICP D2D, new): surface-aware Mahalanobis scoring against the map
cloud, with a per-particle full 3×3 Gauss-Newton step driving the Stein motion.
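
A per-particle 3×3 Gauss-Newton step on an SE(2) pose can be sketched as follows. This is an illustration under assumptions, not the PR's kernel: `Pose2`, `Match`, `det3` and `gauss_newton_step` are hypothetical names, the information matrix M is treated as fixed per match, and how the resulting step feeds into the Stein motion is not shown.

```cpp
#include <cmath>

struct Pose2 { double x, y, theta; };

// One scan-to-map correspondence (hypothetical layout).
struct Match {
    double px, py;         // scan endpoint in the sensor frame
    double qx, qy;         // matched map point in the world frame
    double m00, m01, m11;  // symmetric 2x2 information matrix M
};

static double det3(double A[3][3]) {
    return A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
         - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
         + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]);
}

// One Gauss-Newton step minimising sum_i r_i^T M_i r_i with
// r_i = R(theta) p_i + t - q_i. Returns false if H is near-singular.
bool gauss_newton_step(Pose2& pose, const Match* m, int n) {
    double c = std::cos(pose.theta), s = std::sin(pose.theta);
    double H[3][3] = {{0.0}};
    double g[3] = {0.0, 0.0, 0.0};
    for (int i = 0; i < n; ++i) {
        double r0 = c * m[i].px - s * m[i].py + pose.x - m[i].qx;
        double r1 = s * m[i].px + c * m[i].py + pose.y - m[i].qy;
        // Jacobian of the residual with respect to (x, y, theta).
        double J[2][3] = {{1.0, 0.0, -s * m[i].px - c * m[i].py},
                          {0.0, 1.0,  c * m[i].px - s * m[i].py}};
        double M[2][2] = {{m[i].m00, m[i].m01}, {m[i].m01, m[i].m11}};
        for (int a = 0; a < 3; ++a) {
            double jm0 = J[0][a] * M[0][0] + J[1][a] * M[1][0];
            double jm1 = J[0][a] * M[0][1] + J[1][a] * M[1][1];
            g[a] += jm0 * r0 + jm1 * r1;                      // J^T M r
            for (int b = 0; b < 3; ++b)
                H[a][b] += jm0 * J[0][b] + jm1 * J[1][b];     // J^T M J
        }
    }
    double det = det3(H);
    if (std::fabs(det) < 1e-12) return false;
    // Solve H * delta = -g with Cramer's rule.
    double delta[3];
    for (int k = 0; k < 3; ++k) {
        double A[3][3];
        for (int r = 0; r < 3; ++r)
            for (int col = 0; col < 3; ++col)
                A[r][col] = (col == k) ? -g[r] : H[r][col];
        delta[k] = det3(A) / det;
    }
    pose.x += delta[0];
    pose.y += delta[1];
    pose.theta += delta[2];
    return true;
}
```

With exact correspondences and a small initial heading error, a few repeated calls are typically enough to converge. The sketch omits any damping or step clamping.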

The D2D likelihood is evaluated for one million particles by indexing the map
cloud (2,396 points) with a uniform grid, so each endpoint's nearest-neighbour
lookup only touches a 3×3 cell neighborhood.
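
A uniform-grid nearest-neighbour lookup of this kind can be sketched on the host as below. The names (`Point`, `UniformGrid`, `nearest`) and the per-cell `std::vector` storage are assumptions made for readability; the PR does not describe its grid layout, and a GPU version would more likely use flat arrays. The sketch assumes the cell size is at least the match radius, so the 3×3 block covers every candidate within range.

```cpp
#include <cmath>
#include <vector>

struct Point { float x, y; };

// Hypothetical uniform grid over [x0, x0 + nx*cell) x [y0, y0 + ny*cell).
struct UniformGrid {
    float x0, y0, cell;
    int nx, ny;
    std::vector<std::vector<int>> cells;  // point indices stored per cell

    UniformGrid(const std::vector<Point>& pts, float x0_, float y0_,
                float cell_, int nx_, int ny_)
        : x0(x0_), y0(y0_), cell(cell_), nx(nx_), ny(ny_), cells(nx_ * ny_) {
        for (int i = 0; i < static_cast<int>(pts.size()); ++i) {
            int cx = cell_x(pts[i].x), cy = cell_y(pts[i].y);
            if (cx >= 0 && cx < nx && cy >= 0 && cy < ny)
                cells[cy * nx + cx].push_back(i);
        }
    }

    int cell_x(float x) const { return static_cast<int>(std::floor((x - x0) / cell)); }
    int cell_y(float y) const { return static_cast<int>(std::floor((y - y0) / cell)); }

    // Index of the nearest map point within max_dist of q, or -1 if the
    // 3x3 cell neighbourhood around q holds no point that close.
    int nearest(const std::vector<Point>& pts, Point q, float max_dist) const {
        int cx = cell_x(q.x), cy = cell_y(q.y);
        int best = -1;
        float best_d2 = max_dist * max_dist;
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                int x = cx + dx, y = cy + dy;
                if (x < 0 || x >= nx || y < 0 || y >= ny) continue;
                for (int i : cells[y * nx + x]) {
                    float ex = pts[i].x - q.x, ey = pts[i].y - q.y;
                    float d2 = ex * ex + ey * ey;
                    if (d2 <= best_d2) { best_d2 = d2; best = i; }
                }
            }
        }
        return best;
    }
};
```

A return value of -1 marks the endpoint as unmatched, which is the case the fallback term has to handle.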

Coarse-to-fine for robust recovery. A pure D2D likelihood (flat penalty and
zero gradient outside the match radius) re-localized after the hidden kidnap
only intermittently. The sharper D2D term contracts the particle set harder
before the kidnap, which leaves thin global support, and a lost particle
receives no gradient pull. An unmatched ray therefore falls back to the
distance-field endpoint log-likelihood, which supplies a smooth long-range
pull. In the worst case, with no ray matched, the filter reduces to the field
filter, which keeps global recovery robust. Matched rays use the sharp
surface-aware GICP term, which is where the accuracy gain comes from.
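
The per-ray switch between the two terms can be sketched as below. `HybridParams`, `ray_log_likelihood` and the Gaussian form of the field term are assumptions for illustration; the PR does not state how the two terms are scaled relative to each other.

```cpp
// Hypothetical parameters for the hybrid per-ray score.
struct HybridParams {
    float match_radius;  // largest nearest-neighbour distance counted as a match
    float field_sigma;   // std dev of the distance-field endpoint model
};

// nn_dist:    distance to the nearest map point, negative if none was found
// maha2:      squared Mahalanobis residual of the matched pair
// field_dist: distance-field value at the transformed endpoint
inline float ray_log_likelihood(float nn_dist, float maha2, float field_dist,
                                const HybridParams& p) {
    bool matched = nn_dist >= 0.0f && nn_dist <= p.match_radius;
    if (matched) {
        return -0.5f * maha2;  // sharp surface-aware D2D term
    }
    // Fallback: keeps changing with distance, so a lost particle still
    // sees a gradient toward structure.
    float z = field_dist / p.field_sigma;
    return -0.5f * z * z;
}
```

Because the fallback depends on `field_dist`, an unmatched ray that moves closer to a wall still raises the score, unlike a flat out-of-range penalty.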

Results (SE(2), 2 × 1,048,576 particles, identical machinery, hidden kidnap)

| Metric | Field proxy (#86 likelihood) | GICP D2D (new) |
| --- | --- | --- |
| Post-kidnap RMSE | 0.099 m | 0.064 m |
| Final pose error | 0.040 m | 0.021 m |
| Reacquisition after blackout | 0 frames | 0 frames |
| Average GPU step | 4.9 ms | 12.1 ms |
| Map representation | 340×240 distance field | 2,396-point cloud + per-point disk covariance |

The field-proxy arm reproduces the original Stein MCL demo's ~0.097 m post-kidnap
RMSE (control validated). Numbers are stable across 4 runs to the GPU
atomicAdd-order noise floor (D2D post-kidnap RMSE 0.0639–0.0645 m). The
surface-aware likelihood cuts post-kidnap RMSE by about 35% (0.099 m to
0.064 m) and roughly halves the final pose error (0.040 m to 0.021 m) while
keeping the same robust 0-frame kidnap recovery, at ~2.4× per-step cost for the
grid-indexed nearest-neighbour search.

Test plan

cmake .. && make gpu_megaparticles_gicp_mcl -j builds clean (CUDA C++14, --expt-relaxed-constexpr).
Runs end to end, writes gif/gpu_megaparticles_gicp_mcl.gif (760×215, 1.6 MB).
Field-proxy arm reproduces the distance-field result of the MegaParticles-style Stein MCL demo (#86) (~0.097 m post-kidnap RMSE).
GICP D2D arm recovers the hidden kidnap in 0 frames and beats the field proxy on post-kidnap RMSE and final error, reproducibly across 4 runs.
GIF deployed to gh-pages root (HTTP 200).

Notes

One demo = one .cu file; reuses include/cuda_check.cuh and include/cuda_video.h.
SE(2) demo, deliberately scoped: the map cloud, per-point disk covariances,
combined-covariance Mahalanobis scoring and per-particle Gauss-Newton are the
GICP D2D substance; the distance-field fallback is what makes it robust enough
for global kidnap recovery at this particle count.

New src/gpu_megaparticles_gicp_mcl.cu runs a controlled head-to-head of two 1,048,576-particle MegaParticles filters that share identical machinery (global uniform init, Gauss-Newton particle motion, sparse bucket-neighbor Stein attraction/repulsion, posterior smoothing, representative-state gate, hidden kidnap + scan blackout recovery) and differ only in the per-particle scan-scoring kernel. Arm A is the distance-field endpoint proxy of the original Stein MCL demo (control). Arm B is a GICP-style distribution-to-distribution likelihood: the map is a point cloud with per-point disk covariances (small variance along the surface normal, large along the tangent), indexed by a uniform NN grid; each particle matches every scan endpoint to the nearest map point and scores the surface-aware Gaussian log-likelihood under the combined covariance M = (C_map + R C_scan R^T)^{-1} (Segal et al., RSS 2009), with a per-particle 3x3 Gauss-Newton step driving the Stein motion. Unmatched rays fall back to the distance-field log-likelihood, giving a smooth long-range gradient so a globally lost particle is still pulled toward structure (robust kidnap recovery); matched rays use the sharp surface-aware term for accuracy. Both arms recover the hidden kidnap in 0 frames; the surface-aware GICP D2D likelihood lowers post-kidnap RMSE 0.099 m -> 0.064 m and final error 0.040 m -> 0.021 m versus the field proxy, at ~2.4x per-step cost (4.9 ms -> 12.1 ms) for the grid-indexed nearest-neighbour search.

Replace the hand-tuned representative-state continuity gate carried by the MegaParticles line (#86/#101/#104/#115) with a principled robust fixed-lag smoother, reporting raw max-posterior vs smoothed pose error separately. The GPU runs the expensive part exactly as #86 (1,048,576 particles, distance-field likelihood, bucket-neighbor Stein motion, posterior smoothing) and emits one raw max-posterior representative pose per frame. A lightweight host backend keeps a sliding window of the last 10 frames and jointly optimises a smoothed pose chain by IRLS Gauss-Newton with switchable CV-motion factors (a genuine kidnap discontinuity breaks the link instead of being smeared) and Huber-robust measurement factors (one-frame spurious max-posterior spikes are rejected). A frame is finalized once it falls off the window head. A robust smoother alone cannot distinguish a sustained new-location measurement (kidnap) from an outlier, so it stuck to the coasted old trajectory after the kidnap. Fix: a data-driven reset that fires only when measurements resume far from the coast after a measurement dropout, distinguishing a genuine relocalization from the high-confidence spurious-mode flips during tracking (those stay rejected as outliers). Controlled comparison, 4 runs (GPU atomicAdd noise floor): in-track jitter (mean |d2 pos|) raw 4.31 -> smoothed ~0.06 (~70x, truth 0.0055), in-track RMSE raw 5.4 -> smoothed ~0.25 m, post-kidnap RMSE raw ~1.2-1.9 -> smoothed ~0.09 m, recovers the hidden kidnap in 0 frames; host backend adds negligible cost (GPU step ~5 ms).

rsasaki0109 marked this pull request as ready for review May 26, 2026 03:12

rsasaki0109 merged commit 0318b9f into master May 26, 2026
1 check passed

rsasaki0109 deleted the feat/gpu-megaparticles-gicp-d2d branch May 26, 2026 03:12

rsasaki0109 mentioned this pull request May 26, 2026

Add GPU MegaParticles representative-trajectory smoother #118

Merged
