Merged
2 changes: 1 addition & 1 deletion .github/workflows/benchmark.yml
@@ -102,7 +102,7 @@ jobs:
import openimpala as oi

sizes = [int(s) for s in os.environ.get("BENCH_SIZES", "64,128").split(",")]
solvers = ["pcg", "flexgmres", "bicgstab", "gmres", "pfmg", "mlmg"]
solvers = ["pcg", "flexgmres", "bicgstab", "gmres"]
n_repeats = 3
results = []

Expand Down
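For context on the benchmark matrix above: the grid sizes come from the `BENCH_SIZES` environment variable, parsed exactly as in the workflow snippet. A standalone sketch of that parsing (outside the workflow, so the variable is set explicitly here):

```python
import os

# Mirror the workflow: comma-separated sizes from the environment,
# falling back to "64,128" when BENCH_SIZES is unset.
os.environ["BENCH_SIZES"] = "64,128,256"
sizes = [int(s) for s in os.environ.get("BENCH_SIZES", "64,128").split(",")]
print(sizes)  # [64, 128, 256]
```

The default string keeps CI fast while letting a manual dispatch override the sizes without editing the workflow file.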
2 changes: 1 addition & 1 deletion README.md
@@ -134,7 +134,7 @@ For **GPU acceleration** (NVIDIA CUDA), install `openimpala-cuda` from GitHub Re

```bash
pip install openimpala-cuda --find-links \
https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/
https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6
```

To install with optional dependencies:
4 changes: 2 additions & 2 deletions docs/getting-started.md
@@ -13,7 +13,7 @@ pip install openimpala
# GPU version (requires NVIDIA CUDA runtime)
# GPU wheels are distributed via GitHub Releases due to their size (~300 MB).
pip install openimpala-cuda --find-links \
https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/
https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6
```

**Requirements:** Python 3.8+ and NumPy. Optional: `mpi4py` for MPI parallelism.
@@ -25,7 +25,7 @@ For HPC clusters, download the pre-built Apptainer/Singularity container from

```bash
# Download the latest .sif file
wget https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/openimpala-v4.0.0.sif
wget https://github.com/BASE-Laboratory/OpenImpala/releases/download/v4.0.6/openimpala-v4.0.0.sif

# Run interactively
apptainer shell openimpala-v4.0.0.sif
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -31,7 +31,7 @@ Install from PyPI

# GPU version (NVIDIA CUDA) — distributed via GitHub Releases
pip install openimpala-cuda --find-links \
https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/
https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6

.. toctree::
:maxdepth: 2
2 changes: 1 addition & 1 deletion docs/user-guide/gpu.md
@@ -8,7 +8,7 @@ flood fills, and solver loops are GPU-compatible.
```bash
# GPU wheels are distributed via GitHub Releases due to their size (~300 MB).
pip install openimpala-cuda --find-links \
https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/
https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6
```

The GPU wheel requires:
2 changes: 1 addition & 1 deletion notebooks/profiling_and_tuning.ipynb
@@ -59,7 +59,7 @@
"source": [
"# Install system MPI and Python packages\n",
"!apt-get install -y libopenmpi-dev > /dev/null 2>&1\n",
"!pip install openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/[all] > /dev/null 2>&1\n",
"!pip install openimpala-cuda[all] --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 > /dev/null 2>&1\n",
"!pip install porespy > /dev/null 2>&1\n",
"print(\"Dependencies installed.\")"
]
2 changes: 1 addition & 1 deletion paper.md
@@ -83,7 +83,7 @@ with oi.Session():
print(f"Tortuosity: {result.tortuosity:.4f}")
```

Pre-compiled CPU wheels are distributed via PyPI (`pip install openimpala`) and CUDA GPU wheels via GitHub Releases (`pip install openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/`), both built using `cibuildwheel` with statically linked dependencies. Interactive tutorial notebooks are provided for Google Colab, covering workflows from basic tortuosity computation to digital twin parameterisation with PyBaMM. API reference documentation, installation guides, and interactive tutorial notebooks are available at https://base-laboratory.github.io/OpenImpala/
Pre-compiled CPU wheels are distributed via PyPI (`pip install openimpala`) and CUDA GPU wheels via GitHub Releases (`pip install openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6`), both built using `cibuildwheel` with statically linked dependencies. Interactive tutorial notebooks are provided for Google Colab, covering workflows from basic tortuosity computation to digital twin parameterisation with PyBaMM. API reference documentation, installation guides, and interactive tutorial notebooks are available at https://base-laboratory.github.io/OpenImpala/

## Testing and Quality Assurance

5 changes: 2 additions & 3 deletions src/props/HypreStructSolver.cpp
@@ -480,9 +480,8 @@ bool HypreStructSolver::runSolver(PrecondType precond_type) {
HYPRE_CHECK(ierr);
HYPRE_StructPFMGSetTol(solver, m_eps);
HYPRE_StructPFMGSetMaxIter(solver, m_maxiter);
HYPRE_StructPFMGSetNumPreRelax(solver, 2);
HYPRE_StructPFMGSetNumPostRelax(solver, 2);
HYPRE_StructPFMGSetRelaxType(solver, 2); // 2 = weighted Jacobi (more stable)
HYPRE_StructPFMGSetNumPreRelax(solver, 1);
HYPRE_StructPFMGSetNumPostRelax(solver, 1);
HYPRE_StructPFMGSetPrintLevel(solver, m_verbose > 1 ? 3 : 0);

ierr = HYPRE_StructPFMGSetup(solver, m_A, m_b, m_x);
83 changes: 11 additions & 72 deletions src/props/TortuosityMLMG.cpp
@@ -79,39 +79,18 @@ bool TortuosityMLMG::solve() {
}
mlabec.setDomainBC(lo_bc, hi_bc);

// --- Adjust Dirichlet face values for HYPRE-compatible cell-centre BCs ---
//
// AMReX MLABecLaplacian applies Dirichlet BCs at domain faces (half a cell
// outside the boundary cell centre). The shared flux integration code
// (globalFluxes / value) expects the HYPRE convention where Dirichlet
// values are at boundary cell centres: cell 0 = vlo, cell N-1 = vhi.
//
// To make the MLMG face BC produce the same cell-centre values, we extend
// the face values outward by half a cell:
// face_lo = vlo - 0.5 * (vhi - vlo) / (N - 1)
// face_hi = vhi + 0.5 * (vhi - vlo) / (N - 1)
//
// This ensures the linear solution through cell centres hits exactly
// vlo at cell 0 and vhi at cell N-1, matching HYPRE's τ = (N-1)/N.
const amrex::Box& domain = m_geom.Domain();
const int n_cells = domain.length(idir);
if (n_cells <= 1) {
amrex::Abort("TortuosityMLMG: domain must have more than 1 cell in flow direction.");
}
const amrex::Real half_step = 0.5 * (m_vhi - m_vlo) / static_cast<amrex::Real>(n_cells - 1);
const amrex::Real face_vlo = m_vlo - half_step;
const amrex::Real face_vhi = m_vhi + half_step;

// Set initial guess: linear ramp in flow direction for better convergence
m_mf_solution.setVal(0.0);
{
const amrex::Box& domain = m_geom.Domain();
const int n_cells = domain.length(idir);
if (n_cells <= 1) {
amrex::Abort("TortuosityMLMG: domain must have more than 1 cell in flow direction.");
}
const int dom_lo_dir = domain.smallEnd(idir);
const int dom_hi_dir = domain.bigEnd(idir);
const amrex::Real vlo = face_vlo;
const amrex::Real vhi = face_vhi;
// Ramp from face_vlo at the low face to face_vhi at the high face.
// Cell centres at i map to fraction (i - dom_lo + 0.5) / n_cells.
const amrex::Real inv_n = 1.0 / static_cast<amrex::Real>(n_cells);
const amrex::Real vlo = m_vlo;
const amrex::Real vhi = m_vhi;
#ifdef AMREX_USE_OMP
#pragma omp parallel if (amrex::Gpu::notInLaunchRegion())
#endif
@@ -121,7 +100,8 @@
amrex::ParallelFor(bx, [=] AMREX_GPU_DEVICE(int i, int j, int k) noexcept {
amrex::IntVect iv(i, j, k);
int idx_in_dir = iv[idir] - dom_lo_dir;
amrex::Real frac = (static_cast<amrex::Real>(idx_in_dir) + 0.5) * inv_n;
amrex::Real frac =
static_cast<amrex::Real>(idx_in_dir) / static_cast<amrex::Real>(n_cells - 1);
if (iv[idir] >= dom_lo_dir && iv[idir] <= dom_hi_dir) {
phi(i, j, k) = vlo + frac * (vhi - vlo);
} else if (iv[idir] < dom_lo_dir) {
@@ -134,7 +114,7 @@
}
m_mf_solution.FillBoundary(m_geom.periodicity());

// Set level BC (ghost cell values encode the Dirichlet face data)
// Set level BC (ghost cell values encode the Dirichlet data)
mlabec.setLevelBC(0, &m_mf_solution);

// Set coefficients: alpha*a - beta*div(B*grad)
@@ -145,48 +125,7 @@
acoef.setVal(0.0);
mlabec.setACoeffs(0, acoef);

// B-coefficients: face-centred diffusivities via harmonic mean.
//
// First, extrapolate m_mf_diff_coeff into physical boundary ghost cells
// so that boundary-face harmonic means see the correct value (the adjacent
// interior cell's D) instead of the default 0. Without this, boundary
// faces get B=0, effectively imposing zero-flux Neumann everywhere and
// preventing MLMG from enforcing Dirichlet BCs properly.
{
const amrex::Box& domain = m_geom.Domain();
#ifdef AMREX_USE_OMP
#pragma omp parallel if (amrex::Gpu::notInLaunchRegion())
#endif
for (amrex::MFIter mfi(m_mf_diff_coeff, amrex::TilingIfNotGPU()); mfi.isValid(); ++mfi) {
amrex::Array4<amrex::Real> const dc = m_mf_diff_coeff.array(mfi);
const amrex::Box& vbx = mfi.validbox();
for (int d = 0; d < AMREX_SPACEDIM; ++d) {
// Low boundary: copy interior value into ghost cell
if (vbx.smallEnd(d) == domain.smallEnd(d)) {
const amrex::Box lobx = amrex::adjCellLo(vbx, d, 1);
const int interior = domain.smallEnd(d);
amrex::ParallelFor(lobx, [=] AMREX_GPU_DEVICE(int i, int j, int k) noexcept {
amrex::IntVect iv(i, j, k);
amrex::IntVect iv_int = iv;
iv_int[d] = interior;
dc(i, j, k) = dc(iv_int);
});
}
// High boundary: copy interior value into ghost cell
if (vbx.bigEnd(d) == domain.bigEnd(d)) {
const amrex::Box hibx = amrex::adjCellHi(vbx, d, 1);
const int interior = domain.bigEnd(d);
amrex::ParallelFor(hibx, [=] AMREX_GPU_DEVICE(int i, int j, int k) noexcept {
amrex::IntVect iv(i, j, k);
amrex::IntVect iv_int = iv;
iv_int[d] = interior;
dc(i, j, k) = dc(iv_int);
});
}
}
}
}

// B-coefficients: face-centred diffusivities via harmonic mean
amrex::Array<amrex::MultiFab, AMREX_SPACEDIM> bcoefs;
for (int d = 0; d < AMREX_SPACEDIM; ++d) {
amrex::BoxArray edge_ba = m_ba;
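For intuition on the two conventions this diff touches, here is a small standalone Python sketch (function names are illustrative, not OpenImpala API). The new initial guess uses `frac = idx / (n_cells - 1)`, so the linear ramp hits the Dirichlet values exactly at the first and last cell centres; and face-centred B-coefficients are harmonic means of the two adjacent cell diffusivities, which is why a zero in a boundary ghost cell would block the face entirely (the problem the removed extrapolation loop worked around):

```python
def linear_ramp(n_cells, vlo, vhi):
    # Cell-centre initial guess matching the diff:
    # frac = idx / (n_cells - 1), valid for n_cells > 1
    # (the solver aborts earlier for n_cells <= 1).
    return [vlo + (i / (n_cells - 1)) * (vhi - vlo) for i in range(n_cells)]

def face_harmonic(d_minus, d_plus):
    # Face-centred diffusivity as the harmonic mean of the two
    # adjacent cell values; a zero on either side blocks the face,
    # so boundary ghost cells must not default to zero.
    if d_minus == 0.0 or d_plus == 0.0:
        return 0.0
    return 2.0 * d_minus * d_plus / (d_minus + d_plus)

ramp = linear_ramp(5, 0.0, 1.0)
# ramp[0] == 0.0 and ramp[-1] == 1.0: the Dirichlet values land
# on the boundary cell centres, per the HYPRE convention.
```

The endpoint behaviour is the whole point of the change: with the `(idx + 0.5) / n_cells` fraction being removed, the guess only reached `vlo`/`vhi` half a cell outside the domain.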
2 changes: 1 addition & 1 deletion tutorials/01_hello_openimpala.ipynb
@@ -10,7 +10,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "# Install OpenImpala, PoreSpy for structure generation, and Matplotlib for plots.\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/ porespy matplotlib"
"source": "# Install OpenImpala, PoreSpy for structure generation, and Matplotlib for plots.\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 porespy matplotlib"
},
{
"cell_type": "code",
2 changes: 1 addition & 1 deletion tutorials/02_digital_twin.ipynb
@@ -12,7 +12,7 @@
"outputs": [],
"source": [
"# Install OpenImpala, PyBaMM, and visualization utilities\n",
"!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/ pybamm bpx tifffile matplotlib yt"
"!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 pybamm bpx tifffile matplotlib yt"
]
},
{
2 changes: 1 addition & 1 deletion tutorials/03_rev_and_uncertainty.ipynb
@@ -10,7 +10,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "# Install OpenImpala and utilities\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/ tifffile matplotlib scipy"
"source": "# Install OpenImpala and utilities\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 tifffile matplotlib scipy"
},
{
"cell_type": "code",
2 changes: 1 addition & 1 deletion tutorials/04_multiphase_and_fields.ipynb
@@ -10,7 +10,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "# Install OpenImpala, PoreSpy, yt (for AMReX plotfile visualisation), and Matplotlib.\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/ porespy yt matplotlib"
"source": "# Install OpenImpala, PoreSpy, yt (for AMReX plotfile visualisation), and Matplotlib.\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 porespy yt matplotlib"
},
{
"cell_type": "code",
2 changes: 1 addition & 1 deletion tutorials/05_surrogate_modelling.ipynb
@@ -12,7 +12,7 @@
"outputs": [],
"source": [
"# Install OpenImpala and ML libraries\n",
"!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/ porespy scikit-learn matplotlib"
"!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 porespy scikit-learn matplotlib"
]
},
{
2 changes: 1 addition & 1 deletion tutorials/06_topology_optimisation.ipynb
@@ -10,7 +10,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/ matplotlib"
"source": "!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 matplotlib"
},
{
"cell_type": "code",
4 changes: 2 additions & 2 deletions tutorials/07_hpc_scaling.ipynb
@@ -242,7 +242,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/ porespy matplotlib"
"source": "!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 porespy matplotlib"
},
{
"cell_type": "code",
@@ -266,7 +266,7 @@
{
"cell_type": "markdown",
"metadata": {},
"source": "## Summary\n\n| Scenario | Approach |\n|----------|----------|\n| **Laptop / Colab** | `pip install openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest/download/`, use NumPy arrays directly |\n| **Small cluster (1-16 cores)** | `mpirun -np 16 python script.py` with NumPy loading |\n| **Large cluster (16-128+ cores)** | `mpirun -np 128 python script.py` with `oi.read_image()` for parallel I/O |\n| **HPC without Python** | `mpirun -np 128 Diffusion ./inputs` (pure C++ application) |\n\n### Solver Quick Reference\n\n| Solver | Best For | Notes |\n|--------|----------|-------|\n| **FlexGMRES** | General use | Robust default, handles non-symmetric systems |\n| **PCG** | Symmetric problems | Fastest when applicable |\n| **SMG/PFMG** | Structured grids | Geometric multigrid, excellent on regular domains |\n| **BiCGSTAB** | Non-symmetric | Alternative to GMRES |\n\nThe API is the same at every scale \u2014 only the launch command changes. The scaling study and solver comparison in Sections 6-7 provide concrete data to guide your deployment decisions.\n\n**Back to the beginning:**\n- [Tutorial 1: Getting Started](01_hello_openimpala.ipynb) \u2014 Core workflow with synthetic microstructures.\n- [Tutorial 2: From 3D Image to Device Model](02_digital_twin.ipynb) \u2014 Load real CT scans and export to PyBaMM.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **AMReX:** W. Zhang et al., *AMReX: a framework for block-structured adaptive mesh refinement*, J. Open Source Software 4(37), 1370 (2019). [doi:10.21105/joss.01370](https://doi.org/10.21105/joss.01370)\n3. **AMReX scaling:** A. S. Almgren et al., *Block-structured adaptive mesh refinement \u2014 theory, implementation and application*, J. Comput. Physics 332, 1-28 (2017). [doi:10.1016/j.jcp.2016.12.073](https://doi.org/10.1016/j.jcp.2016.12.073)\n4. **HYPRE:** R. D. Falgout & U. M. Yang, *hypre: A library of high performance preconditioners*, Computational Science \u2014 ICCS 2002, LNCS 2331, pp. 632-641 (2002). [doi:10.1007/3-540-47789-6_66](https://doi.org/10.1007/3-540-47789-6_66)\n5. **Parallel HDF5:** The HDF Group, *HDF5 \u2014 Parallel I/O*, [hdfgroup.org](https://www.hdfgroup.org/solutions/hdf5/)\n6. **Apptainer/Singularity for HPC:** G. M. Kurtzer et al., *Singularity: Scientific containers for mobility of compute*, PLoS ONE 12(5), e0177459 (2017). [doi:10.1371/journal.pone.0177459](https://doi.org/10.1371/journal.pone.0177459)\n7. **MPI standard:** Message Passing Interface Forum, *MPI: A Message-Passing Interface Standard, Version 4.0* (2021). [mpi-forum.org](https://www.mpi-forum.org/docs/mpi-4.0/mpi40-report.pdf)"
"source": "## Summary\n\n| Scenario | Approach |\n|----------|----------|\n| **Laptop / Colab** | `pip install openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6`, use NumPy arrays directly |\n| **Small cluster (1-16 cores)** | `mpirun -np 16 python script.py` with NumPy loading |\n| **Large cluster (16-128+ cores)** | `mpirun -np 128 python script.py` with `oi.read_image()` for parallel I/O |\n| **HPC without Python** | `mpirun -np 128 Diffusion ./inputs` (pure C++ application) |\n\n### Solver Quick Reference\n\n| Solver | Best For | Notes |\n|--------|----------|-------|\n| **FlexGMRES** | General use | Robust default, handles non-symmetric systems |\n| **PCG** | Symmetric problems | Fastest when applicable |\n| **SMG/PFMG** | Structured grids | Geometric multigrid, excellent on regular domains |\n| **BiCGSTAB** | Non-symmetric | Alternative to GMRES |\n\nThe API is the same at every scale \u2014 only the launch command changes. The scaling study and solver comparison in Sections 6-7 provide concrete data to guide your deployment decisions.\n\n**Back to the beginning:**\n- [Tutorial 1: Getting Started](01_hello_openimpala.ipynb) \u2014 Core workflow with synthetic microstructures.\n- [Tutorial 2: From 3D Image to Device Model](02_digital_twin.ipynb) \u2014 Load real CT scans and export to PyBaMM.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **AMReX:** W. Zhang et al., *AMReX: a framework for block-structured adaptive mesh refinement*, J. Open Source Software 4(37), 1370 (2019). [doi:10.21105/joss.01370](https://doi.org/10.21105/joss.01370)\n3. **AMReX scaling:** A. S. Almgren et al., *Block-structured adaptive mesh refinement \u2014 theory, implementation and application*, J. Comput. Physics 332, 1-28 (2017). [doi:10.1016/j.jcp.2016.12.073](https://doi.org/10.1016/j.jcp.2016.12.073)\n4. **HYPRE:** R. D. Falgout & U. M. Yang, *hypre: A library of high performance preconditioners*, Computational Science \u2014 ICCS 2002, LNCS 2331, pp. 632-641 (2002). [doi:10.1007/3-540-47789-6_66](https://doi.org/10.1007/3-540-47789-6_66)\n5. **Parallel HDF5:** The HDF Group, *HDF5 \u2014 Parallel I/O*, [hdfgroup.org](https://www.hdfgroup.org/solutions/hdf5/)\n6. **Apptainer/Singularity for HPC:** G. M. Kurtzer et al., *Singularity: Scientific containers for mobility of compute*, PLoS ONE 12(5), e0177459 (2017). [doi:10.1371/journal.pone.0177459](https://doi.org/10.1371/journal.pone.0177459)\n7. **MPI standard:** Message Passing Interface Forum, *MPI: A Message-Passing Interface Standard, Version 4.0* (2021). [mpi-forum.org](https://www.mpi-forum.org/docs/mpi-4.0/mpi40-report.pdf)"
}
],
"metadata": {