fix collectBoundarySeeds: snapshot phaseFab to host before scanning by jameslehoux · Pull Request #272 · BASE-Laboratory/OpenImpala

jameslehoux · 2026-05-06T22:17:50Z

§3 of the profiling notebook still segfaulted on Colab T4 with 4.2.12 — this time the silent crash is in collectBoundarySeeds (FloodFill.cpp:45), which is called by PercolationCheck before the seed-planting phase 1 that 61cf635 already patched.

The function searches the inlet/outlet domain faces for cells whose phase matches phaseID and pushes those into host-side IntVect vectors. The search itself uses amrex::LoopOnCpu reading phase_arr(i, j, k, 0), and on a CUDA build that Array4<int> view points at device memory, so a host loop reading through it segfaults the same way the previous host write sites did.
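
To make the scan concrete, here is a minimal sketch in plain C++ rather than AMReX. The `Cell` alias, the `collectFaceSeeds` name, the flat x-fastest layout and the `iFace` parameter are all assumptions for illustration; the real function works on an iMultiFab through amrex::LoopOnCpu. The point it shows is that the output is a variable-length list of positions built by a host loop, which is only safe when the array being read lives in host memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for amrex::IntVect: one (i, j, k) cell index.
using Cell = std::array<int, 3>;

// Scan the face i == iFace of an nx*ny*nz host array (x varies fastest)
// and return every cell on that face whose phase equals phaseID.
// Assumption: `phase` is host-resident; this loop must not read device memory.
std::vector<Cell> collectFaceSeeds(const std::vector<int>& phase,
                                   int nx, int ny, int nz,
                                   int iFace, int phaseID) {
    assert(phase.size() == static_cast<std::size_t>(nx) * ny * nz);
    assert(iFace >= 0 && iFace < nx);
    std::vector<Cell> seeds;
    for (int k = 0; k < nz; ++k) {
        for (int j = 0; j < ny; ++j) {
            const std::size_t idx =
                static_cast<std::size_t>(iFace) +
                static_cast<std::size_t>(nx) *
                    (static_cast<std::size_t>(j) +
                     static_cast<std::size_t>(ny) * static_cast<std::size_t>(k));
            if (phase[idx] == phaseID) {
                seeds.push_back(Cell{iFace, j, k});
            }
        }
    }
    return seeds;
}
```

Calling it once with iFace = 0 and once with iFace = nx - 1 would give the inlet and outlet seed lists for a flow direction along x.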

Fix follows the pattern used elsewhere when CPU code genuinely needs to walk iMultiFab data: snapshot phaseFab into a pinned-host iMultiFab once via amrex::MFInfo().SetArena(amrex::The_Pinned_Arena()), copy device → host, sync, then the existing LoopOnCpu walks the host copy. On CPU builds the snapshot is skipped via #ifdef AMREX_USE_GPU and we just alias the input phaseFab.
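
The alias-or-snapshot control flow can be modelled without AMReX or CUDA, as a rough sketch. Everything below is hypothetical: `PhaseData` stands in for the iMultiFab, the `onDevice` flag stands in for the compile-time AMREX_USE_GPU switch, and the vector assignment stands in for the pinned-arena copy plus sync. It is not the code in this PR, only an illustration of why the CPU path costs nothing extra.

```cpp
#include <vector>

// Hypothetical stand-in for an iMultiFab: cell data plus a flag saying
// whether the storage should be treated as device memory.
// Plain C++ model only; nothing here calls AMReX or CUDA.
struct PhaseData {
    std::vector<int> cells;
    bool onDevice;
};

// Return data that host code may read.
// "Device" data is copied once into `snapshot` (modelling the device-to-host
// copy into pinned memory followed by a sync). Host data is returned as-is,
// so the CPU path makes no copy and simply aliases the input.
const std::vector<int>& hostReadable(const PhaseData& src,
                                     std::vector<int>& snapshot) {
    if (src.onDevice) {
        snapshot = src.cells;
        return snapshot;
    }
    return src.cells;
}
```

The caller owns `snapshot`, so the copy stays alive for as long as the host loop needs it and is made at most once per call.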

The ParallelFor + DeviceVector approach used in the seed-planting fix isn't appropriate here because the output is a list of positions (not a fixed-size grid write); list-building reductions aren't a clean primitive in AMReX. The pinned-arena snapshot is one-time per PercolationCheck call and is cheap relative to the flood fill itself.

Other LoopOnCpu / device-memory sites that still need similar treatment (separate commits, none in the current §3 notebook hot path):

- ConnectedComponents.cpp:43 (oi.connected_components only)
- Diffusion.cpp:127, :238 (native binary only, not Python)
- TortuosityHypre.cpp:1012 (checkMatrixProperties() debug)
- io/DatReader.cpp:232 (oi.read_image with .dat input)
- io/RawReader.cpp:488 (oi.read_image with .raw input)
- io/TiffReader.cpp:555, :653 (oi.read_image with .tif input)

These will surface in tutorials 2/4/7 (read_image workflows) on GPU; the fixes are likely the same pinned-arena snapshot or ParallelFor recipe once we hit them.

https://claude.ai/code/session_011dJ5Bwq4Tnr8wxH597XJFf

github-actions · 2026-05-06T22:20:49Z

Performance Benchmark Results

Size	Solver	Wall Time (s)	Tortuosity	Expected	Rel. Error	Iters	Status
64³	pcg	0.7241	0.984375	0.984375	0.00e+00	1	PASS
64³	flexgmres	0.4326	0.984375	0.984375	0.00e+00	N/A	PASS
64³	bicgstab	0.4198	0.984375	0.984375	0.00e+00	N/A	PASS
64³	gmres	0.4216	0.984375	0.984375	0.00e+00	N/A	PASS
128³	pcg	8.8314	0.992188	0.992188	0.00e+00	1	PASS
128³	flexgmres	5.7744	0.992188	0.992188	0.00e+00	N/A	PASS
128³	bicgstab	5.6736	0.992188	0.992188	0.00e+00	N/A	PASS
128³	gmres	5.6557	0.992188	0.992188	0.00e+00	N/A	PASS

Fastest solver: bicgstab at 64³ (0.4198s)

Benchmark: uniform block (analytical τ = (N-1)/N)
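
The Expected column can be reproduced from the formula in the benchmark note. The sketch below is an illustration only; the function names are made up here and are not taken from the benchmark script.

```cpp
#include <cmath>

// Analytical tortuosity of the uniform-block benchmark: tau = (N - 1) / N,
// where N is the number of cells along the flow direction.
double expectedTortuosity(int n) {
    return static_cast<double>(n - 1) / static_cast<double>(n);
}

// Relative error of a computed value against the analytical one.
double relativeError(double computed, double expected) {
    return std::fabs(computed - expected) / std::fabs(expected);
}
```

For N = 64 this gives 63/64 = 0.984375, and for N = 128 it gives 127/128 = 0.9921875, which the table rounds to 0.992188.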

github-actions · 2026-05-06T22:29:33Z

Code Coverage Report

------------------------------------------------------------------------------
                           GCC Code Coverage Report
Directory: .
------------------------------------------------------------------------------
File                                       Lines     Exec  Cover   Missing
------------------------------------------------------------------------------
src/io/CathodeWrite.cpp                       95       83    87%   40-41,97-100,115-116,182-185
src/io/CathodeWrite.H                          1        1   100%
src/io/DatReader.cpp                         135      105    77%   26-27,30,35,92-93,99-100,107-109,135-137,141,144-148,152-155,162,164,208-209,242,245
src/io/DatReader.H                             1        1   100%
src/io/HDF5Reader.cpp                        344       84    24%   40-41,43-44,46-49,52,54-56,58-59,62,64-66,68-74,92-93,126-128,144-145,154-157,174-180,182-187,204,213-215,217,219-228,230-233,236-238,240-251,253-258,266,266,266,266,266,266,266,270,270,270,270,270,270,270,274,276,278,280,282,288,290,297,297,297,297,297,297,297,301,301,301,301,301,301,301,305,305,305,305,305,305,305-306,306,306,306,306,306,306,309,309,309,309,309,309,309-310,310,310,310,310,310,310-311,311,311,311,311,311,311,313,313,313,313,313,313,313-314,314,314,314,314,314,314-315,315,315,315,315,315,315,319,319,319,319,319,319,319,324,324,324,324,324,324,324-325,325,325,325,325,325,325-326,326,326,326,326,326,326-327,327,327,327,327,327,327,332,332,332,332,332,332,332,337,337,337,337,337,337,337-338,338,338,338,338,338,338,343,343,343,343,343,343,343,350,350,350,350,350,350,350,357-358,432-435,437-440
src/io/HDF5Reader.H                            3        3   100%
src/io/ImageLoader.cpp                        61       42    68%   25,38,48,60-62,64-70,72,77,89-90,92,94
src/io/RawReader.cpp                         266      135    50%   49-50,89-90,111-112,115-117,120-121,140-142,155-157,166-168,174-177,185-186,192-196,200-204,209-212,219-224,231-237,271,273-274,276,283-284,301,312,314,318,325,327,331-334,338,346-347,353-355,361-363,365-366,369,372,374,377-380,382-384,386,388-389,391,393-394,396,398-399,401,403-404,406,410-411,413,417-418,420,425,465,471-472,521-524,538,540-542,544,546-548,558,562-564,566,588
src/io/RawReader.H                             1        1   100%
src/io/TiffReader.cpp                        384      130    33%   59-65,67-69,71-73,75-77,79-80,82-84,86-88,90-92,94-96,98-99,101-103,106-108,111-112,114-117,119,122,124-127,143-144,148-150,152-158,160,186,210,217,226,228-231,240,242-245,248,255,288-293,306,309-317,319-320,323-327,331-335,338-342,344-348,351-357,359-363,367,369,375-377,379-393,396,398-402,404-409,413-418,420-425,428-429,432-434,555-575,577-578,581-588,590,593-609,612-614,670,673-674,677-683,685,689-700,702-703
src/io/TiffReader.H                            5        5   100%
src/props/BoundaryCondition.H                131       74    56%   63,68,70,216,224-229,233-236,238-244,247-249,252-253,255,258-261,264-265,271-272,274-279,285-287,290-296,299,303,365-366,371,373
src/props/ConnectedComponents.cpp             69       67    97%   94-95
src/props/ConnectedComponents.H                4        4   100%
src/props/DeffTensor.cpp                      62       59    95%   122,128-129
src/props/Diffusion.cpp                      510      378    74%   93-94,97-98,103-104,106-116,118,123-132,134-141,144-150,153-157,159-163,165,168-173,175-177,179,182-184,186-187,190-191,193,195-198,200,202-203,288-289,297-298,300,349,359-360,368-371,373-375,404-413,415,453,461,465-467,526-527,533,535,539,547,581,610,638,646,735-736,739-740,757-760,771-772,774,824
src/props/EffDiffFillMtx.H                   120      106    88%   58,216-217,221-225,229,231-235
src/props/EffectiveDiffusivityHypre.cpp      413      372    90%   189-191,193-197,352-355,464,616-619,621-623,625-628,637-640,647,676,688-691,693-695,697,709,720,722
src/props/EffectiveDiffusivityHypre.H          7        7   100%
src/props/FloodFill.cpp                       90       87    96%   109-110,250
src/props/HypreStructSolver.cpp              343      210    61%   87-88,121,133-134,145,299,309,311,314,346,356,358,361,367-370,372-376,378-379,381-385,388-389,391-392,394,397-398,401-402,404-407,409-413,415-416,418-422,425-426,428-429,431,434-435,438-439,441-443,445-451,453-457,460-461,463-464,466,469-470,473,475-477,479-485,487-491,494-495,497-498,500,503-504,507,509-511,513-516,518-522,525-526,528-529,531,534-535,538,541-542,555
src/props/HypreStructSolver.H                  6        6   100%
src/props/MacroGeometry.H                     17       17   100%
src/props/ParticleSizeDistribution.cpp        11       11   100%
src/props/ParticleSizeDistribution.H           6        6   100%
src/props/PercolationCheck.cpp                53       46    86%   32-33,49-51,68,73
src/props/PercolationCheck.H                   4        4   100%
src/props/PhysicsConfig.H                     90       89    98%   150
src/props/ResultsJSON.H                      225      222    98%   242,395,416
src/props/REVStudy.cpp                       151      128    84%   72,83-91,159,170-173,175,183-186,188-190
src/props/SolverConfig.H                      32       20    62%   30,32,37-44,75-76
src/props/SpecificSurfaceArea.cpp             56       55    98%   59
src/props/SpecificSurfaceArea.H                6        6   100%
src/props/ThroughThicknessProfile.cpp         38       38   100%
src/props/ThroughThicknessProfile.H            5        5   100%
src/props/Tortuosity.H                         2        2   100%
src/props/TortuosityDirect.cpp               219      191    87%   81-83,86,100-106,113-114,125,134,140,202-209,226,394,424,433
src/props/TortuosityDirect.H                   5        5   100%
src/props/TortuosityHypre.cpp                794      566    71%   149-150,155-156,240-243,246-248,311,335-337,340-341,343,353-355,358-360,390-393,573,597,601,622,639-640,642-644,646-655,657,669,671-681,685-691,693-697,701-703,705-707,709-718,727,729-739,743-751,753-756,758,768,774-777,779-781,790-793,795-797,813,816-817,840-845,856-859,861,898,903-906,909-911,915-918,920,922-925,927,932-934,936,985,994,999,1002-1007,1023-1026,1040-1044,1049-1054,1064-1068,1073-1078,1083-1087,1090-1093,1100-1103,1114,1123,1125,1129,1131,1153,1199-1200,1286-1288,1414-1417
src/props/TortuosityHypre.H                   15       15   100%
src/props/TortuosityHypreFill.H              127       98    77%   85,203,205-212,237-239,241-245,247-248,250,252,255-256,258-262
src/props/TortuosityKernels.H                 97       53    54%   52,56-60,62-65,69-74,76-80,84-85,90,129,143,157,243,245-248,250-253,257-260,262-265
src/props/TortuosityMLMG.cpp                  99       91    91%   160,181-183,185-186,193,206
src/props/TortuosityMLMG.H                     1        1   100%
src/props/TortuositySolverBase.cpp           301      237    78%   70-72,74-75,94-101,104,106,142-145,200,203,205,255,280,298,327,391,394-396,398,406-409,411-417,422,427-429,435-436,438-440,454,460,464-465,467,478,492,496-498,500,502,506
src/props/TortuositySolverBase.H              13       13   100%
src/props/VolumeFraction.cpp                  25       25   100%
src/props/VolumeFraction.H                     4        4   100%
------------------------------------------------------------------------------
TOTAL                                       5447     3908    71%
------------------------------------------------------------------------------

Generated by CI — coverage data from gcovr

codecov · 2026-05-06T22:29:48Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

github-actions Bot added the physics label May 6, 2026

jameslehoux merged commit 6f5ca0b into master May 6, 2026
6 checks passed

jameslehoux merged 1 commit into master from claude/upbeat-mccarthy-f1mNN