Add GPU MegaParticles LSH neighbor index demo #101
Merged
Replace the fixed-grid neighbor stand-in of the earlier MegaParticles-style demo with an explicit p-stable LSH neighbor index (Datar et al. 2004): L=8 independent hash tables, each from K=3 random Gaussian projections of the 4-D pose feature quantised at bin width r, with collision in any table defining a neighbor. A controlled head-to-head comparison runs two 1M-particle filters with identical Stein machinery, likelihood, posterior smoothing, and representative-state readout, so the only independent variable is the neighbor structure. Neighbor recall vs brute-force kNN rises 58.2% -> 87.8% as the random multi-table OR recovers the cross-boundary neighbors the single grid misses; post-kidnap relocalization RMSE 0.099 -> 0.088 m, both reacquire in 0 frames. LSH costs 9.6 ms vs grid 4.9 ms per step (8-table OR atomics).
This was referenced May 25, 2026
rsasaki0109 added a commit that referenced this pull request May 26, 2026
Replace the hand-tuned representative-state continuity gate carried by the MegaParticles line (#86/#101/#104/#115) with a principled robust fixed-lag smoother, reporting raw max-posterior vs smoothed pose error separately. The GPU runs the expensive part exactly as #86 (1,048,576 particles, distance-field likelihood, bucket-neighbor Stein motion, posterior smoothing) and emits one raw max-posterior representative pose per frame. A lightweight host backend keeps a sliding window of the last 10 frames and jointly optimises a smoothed pose chain by IRLS Gauss-Newton with switchable CV-motion factors (a genuine kidnap discontinuity breaks the link instead of being smeared) and Huber-robust measurement factors (one-frame spurious max-posterior spikes are rejected). A frame is finalized once it falls off the window head. A robust smoother alone cannot distinguish a sustained new-location measurement (kidnap) from an outlier, so it stuck to the coasted old trajectory after the kidnap. Fix: a data-driven reset that fires only when measurements resume far from the coast after a measurement dropout, distinguishing a genuine relocalization from the high-confidence spurious-mode flips during tracking (those stay rejected as outliers). Controlled comparison, 4 runs (GPU atomicAdd noise floor): in-track jitter (mean |d2 pos|) raw 4.31 -> smoothed ~0.06 (~70x, truth 0.0055), in-track RMSE raw 5.4 -> smoothed ~0.25 m, post-kidnap RMSE raw ~1.2-1.9 -> smoothed ~0.09 m, recovers the hidden kidnap in 0 frames; host backend adds negligible cost (GPU step ~5 ms).
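The Huber-robust measurement factors and the jitter metric mentioned in this commit message can be illustrated with a scalar sketch. This is not the backend's pose-chain Gauss-Newton solver: it only shows, under simplifying assumptions, how IRLS with Huber weights down-weights a one-sample spike, and how a mean absolute second difference (read here as the "mean |d2 pos|" jitter, per axis) could be computed. The function names `huber_weight`, `irls_location`, and `mean_abs_second_diff` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Huber IRLS weight: 1 inside the threshold, delta/|r| outside it,
// so large residuals contribute linearly instead of quadratically.
double huber_weight(double residual, double delta) {
    const double a = std::fabs(residual);
    return a <= delta ? 1.0 : delta / a;
}

// Scalar illustration only: robust location estimate by iteratively
// reweighted least squares, starting from the plain mean.
double irls_location(const std::vector<double>& z, double delta, int iters) {
    if (z.empty()) return 0.0;
    double mu = 0.0;
    for (double v : z) mu += v;
    mu /= static_cast<double>(z.size());
    for (int it = 0; it < iters; ++it) {
        double num = 0.0, den = 0.0;
        for (double v : z) {
            const double w = huber_weight(v - mu, delta);
            num += w * v;
            den += w;
        }
        mu = num / den;  // den > 0 because every weight is positive
    }
    return mu;
}

// Jitter as the mean absolute second difference of a 1-D track
// (assumed interpretation of "mean |d2 pos|", applied per axis).
double mean_abs_second_diff(const std::vector<double>& p) {
    if (p.size() < 3) return 0.0;
    double sum = 0.0;
    for (std::size_t i = 1; i + 1 < p.size(); ++i)
        sum += std::fabs(p[i + 1] - 2.0 * p[i] + p[i - 1]);
    return sum / static_cast<double>(p.size() - 2);
}
```

With four samples at 1 and one spike at 100, the plain mean is 20.8, while the Huber estimate with `delta = 1` settles near 1.25. That is the "spike is rejected" behaviour in miniature.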
Summary
MegaParticles-style relocalization with an explicit p-stable LSH neighbor index, replacing the fixed-grid neighbor stand-in of the earlier `gpu_megaparticles_stein_mcl` demo. The grid was a single axis-aligned partition, so particles near a cell boundary never aggregated with their true neighbors one cell over. This demo uses the actual Datar et al. (2004) p-stable LSH scheme: L=8 independent hash tables, each formed from K=3 random Gaussian projections of the 4-D pose feature (x, y, s·cos θ, s·sin θ) quantised at bin width r. Two particles are neighbors if they collide in at least one table; the random offsets and multiple tables recover the cross-boundary neighbors the grid misses.

Both filter paths are identical except for the neighbor structure: one million globally-distributed particles, the same range-field likelihood, the same Gauss-Newton-like per-particle step, the same posterior smoothing, the same shared coarse-grid representative-state readout, and the same hidden-kidnap blackout. The only independent variable is grid-neighbor vs LSH-neighbor aggregation, so the reported neighbor recall and post-kidnap RMSE isolate the contribution of the explicit LSH index.
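As a host-side illustration of the hashing scheme described above, the following is a minimal sketch and not the demo's GPU kernel. It assumes the standard p-stable hash h(v) = floor((a·v + b) / r) with Gaussian `a` and an offset `b` drawn uniformly from [0, r). The type and function names (`LshIndex`, `make_index`, `hash_key`, `are_neighbors`) are hypothetical, and the heading scale `s` is passed in as a plain parameter.

```cpp
#include <array>
#include <cmath>
#include <cstdint>
#include <random>

constexpr int kTables = 8;  // L: independent hash tables
constexpr int kProj = 3;    // K: projections concatenated per table
constexpr int kDim = 4;     // pose feature dimension

using Feature = std::array<float, kDim>;
using Key = std::array<std::int32_t, kProj>;

struct LshTable {
    std::array<Feature, kProj> a;  // Gaussian projection directions
    std::array<float, kProj> b;    // random offsets in [0, r)
};

struct LshIndex {
    float r;  // bin width
    std::array<LshTable, kTables> tables;
};

// 4-D pose feature (x, y, s*cos(theta), s*sin(theta)).
Feature pose_feature(float x, float y, float theta, float s) {
    return {x, y, s * std::cos(theta), s * std::sin(theta)};
}

LshIndex make_index(float r, std::uint32_t seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<float> gauss(0.0f, 1.0f);
    std::uniform_real_distribution<float> offset(0.0f, r);
    LshIndex idx;
    idx.r = r;
    for (auto& t : idx.tables) {
        for (int k = 0; k < kProj; ++k) {
            for (int d = 0; d < kDim; ++d) t.a[k][d] = gauss(rng);
            t.b[k] = offset(rng);
        }
    }
    return idx;
}

// One table's bucket key: K quantised projections of the feature.
Key hash_key(const LshIndex& idx, int table, const Feature& v) {
    Key key{};
    const LshTable& t = idx.tables[table];
    for (int k = 0; k < kProj; ++k) {
        float dot = t.b[k];
        for (int d = 0; d < kDim; ++d) dot += t.a[k][d] * v[d];
        key[k] = static_cast<std::int32_t>(std::floor(dot / idx.r));
    }
    return key;
}

// Multi-table OR: neighbors if the keys match in at least one table.
bool are_neighbors(const LshIndex& idx, const Feature& u, const Feature& v) {
    for (int t = 0; t < kTables; ++t)
        if (hash_key(idx, t, u) == hash_key(idx, t, v)) return true;
    return false;
}
```

Requiring all K projections to match within a table makes each table selective, while the OR across L tables gives a nearby pair several independent chances to share a bucket. This is why a pair split by one partition boundary can still be recovered by another table.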
Neighbor recall is measured directly: on a sampled particle pool, brute-force kNN within a fixed feature radius defines the ground-truth neighbor set, and each method is scored by the fraction of true neighbors it recovers (same-grid-cell vs collide-in-any-LSH-table).
Results
The LSH index recovers ~30 points more of the true neighbor set (recall 58.2% -> 87.8%; the multi-table OR overcomes the single grid's boundary misses), with comparable-to-slightly-better relocalization (post-kidnap RMSE 0.099 -> 0.088 m, both reacquire in 0 frames), at ~2× the per-step cost (9.6 ms vs 4.9 ms) from the 8-table OR atomics. The demo reports this trade rather than hiding it.
Test plan
- `cmake .. && make gpu_megaparticles_lsh -j$(nproc)` builds clean
- `./bin/gpu_megaparticles_lsh` runs end-to-end, writes the GIF
- `git diff --check` clean (no whitespace errors)