Add GPU MegaParticles LSH neighbor index demo by rsasaki0109 · Pull Request #101 · rsasaki0109/CudaRobotics

rsasaki0109 · 2026-05-25T07:55:29Z

Summary

MegaParticles-style relocalization with an explicit p-stable LSH neighbor index, replacing the fixed-grid neighbor stand-in of the earlier gpu_megaparticles_stein_mcl demo. The grid was a single axis-aligned partition, so particles near a cell boundary never aggregated with their true neighbors one cell over. This demo uses the actual Datar et al. (2004) p-stable LSH scheme: L=8 independent hash tables, each formed from K=3 random Gaussian projections of the 4-D pose feature (x, y, s·cos θ, s·sin θ) quantised at bin width r. Two particles are neighbors if they collide in at least one table; the random offsets and multiple tables recover the cross-boundary neighbors the grid misses.
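
The hashing scheme described above can be sketched on the host in a few lines. This is a minimal illustration, not the demo's CUDA kernel: the bin width `BIN_WIDTH`, the heading scale `HEADING_SCALE`, and every function name here are assumptions, since the description gives only L=8, K=3, and the 4-D feature. Each projection uses the standard p-stable form floor((a·v + b) / r), with Gaussian a and an offset b drawn uniformly from [0, r).

```python
import math
import random

NUM_TABLES = 8       # L in the PR description
NUM_PROJ = 3         # K in the PR description
BIN_WIDTH = 0.5      # r: assumed value, not stated in the PR
HEADING_SCALE = 1.0  # s: assumed value, not stated in the PR


def pose_feature(x, y, theta, s=HEADING_SCALE):
    """4-D pose feature (x, y, s*cos(theta), s*sin(theta))."""
    return (x, y, s * math.cos(theta), s * math.sin(theta))


def make_tables(seed=0):
    """Draw L tables, each holding K (Gaussian vector, uniform offset) pairs."""
    rng = random.Random(seed)
    return [
        [([rng.gauss(0.0, 1.0) for _ in range(4)], rng.uniform(0.0, BIN_WIDTH))
         for _ in range(NUM_PROJ)]
        for _ in range(NUM_TABLES)
    ]


def hash_keys(feature, tables):
    """Return one K-tuple of integer bins per table."""
    keys = []
    for projections in tables:
        keys.append(tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, feature)) + b) / BIN_WIDTH)
            for a, b in projections
        ))
    return keys


def are_neighbors(keys_p, keys_q):
    """Two particles are neighbors if their keys match in at least one table."""
    return any(kp == kq for kp, kq in zip(keys_p, keys_q))


tables = make_tables()
keys_a = hash_keys(pose_feature(1.00, 2.00, 0.30), tables)
keys_far = hash_keys(pose_feature(1000.0, -1000.0, 0.30), tables)
print(are_neighbors(keys_a, keys_a), are_neighbors(keys_a, keys_far))
```

Within a table, all K bins must match (an AND), which keeps buckets small. Across tables, any single match suffices (an OR), which is what lets a pair split by one table's bin boundary still meet in another.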

Both filter paths are identical except for the neighbor structure: one million globally distributed particles, the same range-field likelihood, the same Gauss-Newton-like per-particle step, the same posterior smoothing, the same shared coarse-grid representative-state readout, and the same hidden-kidnap blackout. The only independent variable is grid-neighbor vs LSH-neighbor aggregation, so the reported neighbor recall and post-kidnap RMSE isolate the contribution of the explicit LSH index.

Neighbor recall is measured directly: on a sampled particle pool, brute-force kNN within a fixed feature radius defines the ground-truth neighbor set, and each method is scored by the fraction of true neighbors it recovers (same-grid-cell vs collide-in-any-LSH-table).
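
The recall measurement can be sketched as follows. This is a hedged illustration under stated assumptions: it counts unordered pairs within the radius (the PR does not say whether recall is pooled over pairs or averaged per particle), and the names `true_neighbor_pairs`, `neighbor_recall`, and `same_grid_cell` are hypothetical. Any candidate test, such as a collide-in-any-table predicate, can be passed in place of the grid one.

```python
import math


def true_neighbor_pairs(points, radius):
    """Brute-force ground truth: index pairs (i, j), i < j, within `radius`."""
    pairs = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                pairs.add((i, j))
    return pairs


def neighbor_recall(points, radius, is_candidate):
    """Fraction of true neighbor pairs that the candidate test also reports."""
    truth = true_neighbor_pairs(points, radius)
    if not truth:
        return float("nan")  # recall is undefined without any true neighbors
    found = sum(1 for i, j in truth if is_candidate(points[i], points[j]))
    return found / len(truth)


def same_grid_cell(cell_size):
    """Candidate test for a single axis-aligned grid with the given cell size."""
    def predicate(p, q):
        return all(math.floor(a / cell_size) == math.floor(b / cell_size)
                   for a, b in zip(p, q))
    return predicate


# Two close pairs: one straddles the cell boundary at x = 1.0, one does not.
sample = [(0.9, 0.5), (1.1, 0.5), (0.2, 0.5), (0.4, 0.5)]
print(neighbor_recall(sample, radius=0.3, is_candidate=same_grid_cell(1.0)))
```

In the toy sample the grid finds the pair inside one cell and misses the pair that straddles the boundary, which is the failure mode the Summary attributes to the fixed grid.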

Results

| Metric | Fixed grid | Explicit LSH |
| --- | --- | --- |
| Neighbor recall vs brute-force kNN | 58.2% | 87.8% |
| Post-kidnap RMSE | 0.099 m | 0.088 m |
| Reacquisition after blackout | 0 frames | 0 frames |
| Avg GPU step | 4.9 ms | 9.6 ms |

The LSH index recovers about 30 percentage points more of the true neighbor set (87.8% vs 58.2%), because the multi-table OR overcomes the single grid's boundary misses. Relocalization is comparable to slightly better (post-kidnap RMSE 0.088 m vs 0.099 m), at roughly 2× the per-step cost (9.6 ms vs 4.9 ms) from the 8-table OR atomics. The demo reports this trade rather than hiding it.
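
The effect of the multi-table OR can be made concrete with the standard LSH amplification formula. The sketch below assumes independent projections and tables. The per-projection collision probability `p = 0.8` is an illustrative value, not a measurement from the demo, and `collision_probability` is a hypothetical helper.

```python
def collision_probability(p, num_proj=3, num_tables=8):
    """P(a pair collides in at least one table).

    p is the chance the pair lands in the same bin under one projection.
    A table ANDs num_proj projections; the index ORs num_tables tables.
    """
    per_table = p ** num_proj
    return 1.0 - (1.0 - per_table) ** num_tables


p = 0.8  # illustrative per-projection collision probability for a nearby pair
print("single table:", collision_probability(p, num_tables=1))
print("eight tables:", collision_probability(p, num_tables=8))
```

Under these assumptions a pair that a single K=3 table catches only about half the time is caught by at least one of eight tables almost always, at the price of eight hash evaluations and bucket updates per particle.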

Test plan

cmake .. && make gpu_megaparticles_lsh -j$(nproc) builds clean
./bin/gpu_megaparticles_lsh runs end-to-end, writes the GIF
neighbor recall 58.2% → 87.8%, post-kidnap RMSE 0.099 → 0.088 m
git diff --check clean (no whitespace errors)
GIF ≤ 3 MB (1.5 MB), deployed to gh-pages, URL returns HTTP 200

Replace the fixed-grid neighbor stand-in of the earlier MegaParticles-style demo with an explicit p-stable LSH neighbor index (Datar et al. 2004): L=8 independent hash tables, each from K=3 random Gaussian projections of the 4-D pose feature quantised at bin width r, with collision in any table defining a neighbor. A controlled head-to-head comparison runs two 1M-particle filters with identical Stein machinery, likelihood, posterior smoothing, and representative-state readout, so the only independent variable is the neighbor structure. Neighbor recall vs brute-force kNN rises 58.2% -> 87.8% as the random multi-table OR recovers the cross-boundary neighbors the single grid misses; post-kidnap relocalization RMSE 0.099 -> 0.088 m, both reacquire in 0 frames. LSH costs 9.6 ms vs grid 4.9 ms per step (8-table OR atomics).

Replace the hand-tuned representative-state continuity gate carried by the MegaParticles line (#86/#101/#104/#115) with a principled robust fixed-lag smoother, reporting raw max-posterior vs smoothed pose error separately. The GPU runs the expensive part exactly as #86 (1,048,576 particles, distance-field likelihood, bucket-neighbor Stein motion, posterior smoothing) and emits one raw max-posterior representative pose per frame. A lightweight host backend keeps a sliding window of the last 10 frames and jointly optimises a smoothed pose chain by IRLS Gauss-Newton with switchable CV-motion factors (a genuine kidnap discontinuity breaks the link instead of being smeared) and Huber-robust measurement factors (one-frame spurious max-posterior spikes are rejected). A frame is finalized once it falls off the window head.

A robust smoother alone cannot distinguish a sustained new-location measurement (kidnap) from an outlier, so it stuck to the coasted old trajectory after the kidnap. Fix: a data-driven reset that fires only when measurements resume far from the coast after a measurement dropout, distinguishing a genuine relocalization from the high-confidence spurious-mode flips during tracking (those stay rejected as outliers).

Controlled comparison, 4 runs (GPU atomicAdd noise floor): in-track jitter (mean |d2 pos|) raw 4.31 -> smoothed ~0.06 (~70x, truth 0.0055), in-track RMSE raw 5.4 -> smoothed ~0.25 m, post-kidnap RMSE raw ~1.2-1.9 -> smoothed ~0.09 m, recovers the hidden kidnap in 0 frames; host backend adds negligible cost (GPU step ~5 ms).
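
The Huber-robust measurement factors mentioned in this commit message can be illustrated in one dimension. This is a deliberately reduced sketch, not the host backend's code: it estimates a single scalar location by iteratively reweighted least squares and omits the pose chain, the switchable CV-motion factors, and the reset logic. The threshold `delta` and all names are assumptions.

```python
def huber_weight(residual, delta):
    """IRLS weight for the Huber loss: 1 inside the threshold, delta/|r| outside."""
    r = abs(residual)
    return 1.0 if r <= delta else delta / r


def irls_location(measurements, delta=1.0, iterations=20):
    """Robust location estimate: repeat a weighted mean with Huber weights."""
    estimate = sorted(measurements)[len(measurements) // 2]  # start near the median
    for _ in range(iterations):
        weights = [huber_weight(z - estimate, delta) for z in measurements]
        estimate = sum(w * z for w, z in zip(weights, measurements)) / sum(weights)
    return estimate


# Five consistent measurements and one spurious spike.
samples = [1.0, 1.1, 0.9, 1.05, 0.95, 9.0]
plain_mean = sum(samples) / len(samples)
print(plain_mean, irls_location(samples))
```

The spike keeps a bounded pull on the estimate instead of a pull proportional to its distance, so the result stays close to the consistent measurements while the plain mean is dragged toward the outlier.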

rsasaki0109 marked this pull request as ready for review May 25, 2026 08:12

rsasaki0109 merged commit 2c125b1 into master May 25, 2026
1 check passed

rsasaki0109 deleted the feat/gpu-megaparticles-lsh branch May 25, 2026 08:13

This was referenced May 25, 2026

Add GPU MegaParticles 6-DoF SE(3) relocalization demo #104

Merged

Add GPU MegaParticles GICP distribution-to-distribution likelihood #115

Merged

Add GPU MegaParticles representative-trajectory smoother #118

Merged

rsasaki0109 merged 1 commit into master from feat/gpu-megaparticles-lsh
