From db9409a06b9b4786fa227312d9fb1fe3c9e6fc88 Mon Sep 17 00:00:00 2001
From: James Le Houx <james.lehoux@gre.ac.uk>
Date: Sat, 4 Apr 2026 12:00:14 +0000
Subject: [PATCH 1/3] Fix CuPy solver: use rtol parameter (not tol) for cg()

CuPy 13.x renamed the relative-tolerance keyword of cg() from tol to
rtol, matching modern SciPy. The CuPy path in solve_tortuosity() still
passed the old tol keyword; it now passes the same value as rtol.

https://claude.ai/code/session_01RKnn97qiD7sbCeABHH3eQk
---
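
Reviewer note (not part of the commit message): the patch hard-codes
rtol. If older CuPy releases had to stay supported, the keyword could be
chosen at call time instead. The sketch below is only an illustration of
that idea; tolerance_kwarg, cg_old and cg_new are hypothetical names that
do not exist in OpenImpala or CuPy, and the two cg_* functions are
stand-ins for an old-style and a new-style signature.

```python
import inspect


def tolerance_kwarg(cg_func, tol):
    """Return the tolerance keyword that cg_func accepts, as a kwargs dict."""
    params = inspect.signature(cg_func).parameters
    # Prefer the newer name when the given function exposes it.
    if "rtol" in params:
        return {"rtol": tol}
    return {"tol": tol}


# Stand-ins for an old-style and a new-style cg(); not real library functions.
def cg_old(A, b, x0=None, tol=1e-5, maxiter=None):
    return None, 0


def cg_new(A, b, x0=None, rtol=1e-5, maxiter=None):
    return None, 0


print(tolerance_kwarg(cg_old, 1e-8))
print(tolerance_kwarg(cg_new, 1e-8))
```

Under that assumption the call site would pass
**tolerance_kwarg(cusp_linalg.cg, tol) in place of the explicit keyword.
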
 python/openimpala/_solver.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/python/openimpala/_solver.py b/python/openimpala/_solver.py
index 1a175101..790fc346 100644
--- a/python/openimpala/_solver.py
+++ b/python/openimpala/_solver.py
@@ -415,7 +415,7 @@ def solve_tortuosity(
 
         # CuPy CG solve
         solution_gpu, info = cusp_linalg.cg(A_gpu, rhs_gpu, x0=x0_gpu,
-                                             tol=tol, maxiter=maxiter)
+                                             rtol=tol, maxiter=maxiter)
         solution = cp.asnumpy(solution_gpu)
         converged = info == 0
         # CuPy doesn't return iteration count directly; estimate from info

From 9fa48153fa1b7e5ca66e9b635f2b4f94117f5dcb Mon Sep 17 00:00:00 2001
From: James Le Houx <james.lehoux@gre.ac.uk>
Date: Sat, 4 Apr 2026 12:03:03 +0000
Subject: [PATCH 2/3] Tutorials 02, 04, 07: install openimpala-cuda for C++ API
 sections

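These three tutorials use the low-level C++ API (openimpala.core) or
oi.read_image(), both of which require the compiled backend. Update
their install cells to use openimpala-cuda from GitHub Releases.

In Tutorials 02 and 04 the remaining changed lines only re-encode
non-ASCII characters (dashes, arrows) as JSON \uXXXX escapes; the
rendered text is unchanged.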

Tutorials 01, 03, 05, 06 remain pure-Python (pip install openimpala).

https://claude.ai/code/session_01RKnn97qiD7sbCeABHH3eQk
---
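
Reviewer note (not part of the commit message): since Tutorials 02, 04
and 07 now depend on the compiled backend, a guard cell near the top of
each notebook could fail early with a readable message. This is a rough
sketch, assuming the pure-Python wheel simply does not ship
openimpala.core; has_module and require_compiled_backend are hypothetical
helpers, not part of OpenImpala.

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError (an ImportError).
        return False


def require_compiled_backend(module="openimpala.core"):
    """Raise a readable error when the compiled extension is absent."""
    if not has_module(module):
        raise RuntimeError(
            f"{module} not found: this tutorial needs the compiled backend "
            "(install openimpala-cuda rather than the pure-Python openimpala)."
        )
```

Calling require_compiled_backend() right after the install cell would
surface a wrong install before any solver code runs.
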
 tutorials/02_digital_twin.ipynb          |  6 +++---
 tutorials/04_multiphase_and_fields.ipynb | 22 +++++++++++-----------
 tutorials/07_hpc_scaling.ipynb           |  8 ++++----
 3 files changed, 18 insertions(+), 18 deletions(-)
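
Reviewer note: the Tutorial 04 cells touched below state a harmonic-mean
face coefficient and two layered-composite benchmarks. As a quick sanity
check of those formulas (copied from the notebook text), they can be
evaluated directly. The function names are hypothetical, and returning
zero when both neighbours have D = 0 is an assumption of this sketch, not
a statement about the HYPRE matrix fill.

```python
def harmonic_face(d_left, d_right):
    """Harmonic mean of two cell diffusivities, as used at a shared face."""
    total = d_left + d_right
    # Assumed convention: two impermeable neighbours give a zero face value.
    return 0.0 if total == 0 else 2.0 * d_left * d_right / total


def tau_series(n, d0, d1):
    """Layers perpendicular to flow on an n-cell grid (Reuss / harmonic)."""
    return (n - 1) * (d0 + d1) / (2 * n * d0 * d1)


def tau_parallel(n, d0, d1):
    """Layers parallel to flow on an n-cell grid (Voigt / arithmetic)."""
    return 2 * (n - 1) / (n * (d0 + d1))


# Values used in the tutorial: N = 32, D0 = 1.0, D1 = 0.5.
print(harmonic_face(1.0, 0.5))
print(tau_series(32, 1.0, 0.5), tau_parallel(32, 1.0, 0.5))
```

With D0 = D1 both expressions reduce to (N-1)/N, the uniform value the
tutorial plots as a dotted line.
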

diff --git a/tutorials/02_digital_twin.ipynb b/tutorials/02_digital_twin.ipynb
index 5b51c438..d111938c 100644
--- a/tutorials/02_digital_twin.ipynb
+++ b/tutorials/02_digital_twin.ipynb
@@ -3,14 +3,14 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "<a href=\"https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/master/tutorials/02_digital_twin.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n# Tutorial 2 of 7: From 3D Image to Device Model\n\n*OpenImpala Tutorial Series — From first solve to HPC deployment*\n\n---\n\nA common workflow in electrochemical research is to characterise a 3D microstructure from X-ray tomography and feed the resulting transport parameters into a continuum device model such as [PyBaMM](https://pybamm.org/). This tutorial walks through that pipeline end-to-end.\n\n**What you will learn:**\n1. Load a real 3D TIFF image stack.\n2. Compute directional tortuosity (X, Y, Z) to quantify anisotropy.\n3. Visualise the 3D diffusion field using the C++ core API and `yt`.\n4. Export results to the BPX (Battery Parameter eXchange) JSON format.\n5. Import those parameters into PyBaMM and run a discharge simulation.\n\n**Prerequisites:** [Tutorial 1](01_hello_openimpala.ipynb) — basic OpenImpala workflow (volume fraction, percolation, tortuosity)."
+   "source": "<a href=\"https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/master/tutorials/02_digital_twin.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n# Tutorial 2 of 7: From 3D Image to Device Model\n\n*OpenImpala Tutorial Series \u2014 From first solve to HPC deployment*\n\n---\n\nA common workflow in electrochemical research is to characterise a 3D microstructure from X-ray tomography and feed the resulting transport parameters into a continuum device model such as [PyBaMM](https://pybamm.org/). This tutorial walks through that pipeline end-to-end.\n\n**What you will learn:**\n1. Load a real 3D TIFF image stack.\n2. Compute directional tortuosity (X, Y, Z) to quantify anisotropy.\n3. Visualise the 3D diffusion field using the C++ core API and `yt`.\n4. Export results to the BPX (Battery Parameter eXchange) JSON format.\n5. Import those parameters into PyBaMM and run a discharge simulation.\n\n**Prerequisites:** [Tutorial 1](01_hello_openimpala.ipynb) \u2014 basic OpenImpala workflow (volume fraction, percolation, tortuosity)."
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Install OpenImpala, PyBaMM, and visualization utilities\n!pip install -q openimpala pybamm bpx tifffile matplotlib yt"
+   "source": "# Install OpenImpala (compiled C++ backend needed for low-level API in this tutorial)\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest pybamm bpx tifffile matplotlib yt"
   },
   {
    "cell_type": "code",
@@ -168,7 +168,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## Next Steps\n\nThis tutorial demonstrated the full pipeline from a 3D tomography image to a device-level simulation, with BPX as the interchange format.\n\n**Continue the series:**\n- [Tutorial 3: REV and Uncertainty](03_rev_and_uncertainty.ipynb) — How large does your sample need to be for statistically representative results?\n- [Tutorial 4: Effective Diffusivity and Field Visualisation](04_multiphase_and_fields.ipynb) — Extend to the cell-problem solver and multi-phase composites.\n- [Tutorial 7: Scaling to HPC](07_hpc_scaling.ipynb) — Analyse full-resolution synchrotron datasets with MPI.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **PyBaMM:** M. Sulzer et al., *Python Battery Mathematical Modelling (PyBaMM)*, J. Open Research Software 9(1), 14 (2021). [doi:10.5334/jors.309](https://doi.org/10.5334/jors.309)\n3. **BPX standard:** *Battery Parameter eXchange (BPX)*, an open standard for lithium-ion battery parameterisation. [GitHub](https://github.com/pybop-team/BPX)\n4. **Electrode tortuosity & anisotropy:** M. Ebner et al., *Tortuosity anisotropy in lithium-ion battery electrodes*, Advanced Energy Materials 4(5), 1301278 (2014). [doi:10.1002/aenm.201301278](https://doi.org/10.1002/aenm.201301278)\n5. **yt visualisation:** M. J. Turk et al., *yt: A multi-code analysis toolkit for astrophysical simulation data*, Astrophys. J. Suppl. 192, 9 (2011). [doi:10.1088/0067-0049/192/1/9](https://doi.org/10.1088/0067-0049/192/1/9)\n6. **Digital twin concept for batteries:** A. A. Franco et al., *Boosting rechargeable batteries R&D by multiscale modeling: myth or reality?*, Chemical Reviews 119(7), 4569–4627 (2019). [doi:10.1021/acs.chemrev.8b00239](https://doi.org/10.1021/acs.chemrev.8b00239)"
+   "source": "## Next Steps\n\nThis tutorial demonstrated the full pipeline from a 3D tomography image to a device-level simulation, with BPX as the interchange format.\n\n**Continue the series:**\n- [Tutorial 3: REV and Uncertainty](03_rev_and_uncertainty.ipynb) \u2014 How large does your sample need to be for statistically representative results?\n- [Tutorial 4: Effective Diffusivity and Field Visualisation](04_multiphase_and_fields.ipynb) \u2014 Extend to the cell-problem solver and multi-phase composites.\n- [Tutorial 7: Scaling to HPC](07_hpc_scaling.ipynb) \u2014 Analyse full-resolution synchrotron datasets with MPI.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **PyBaMM:** M. Sulzer et al., *Python Battery Mathematical Modelling (PyBaMM)*, J. Open Research Software 9(1), 14 (2021). [doi:10.5334/jors.309](https://doi.org/10.5334/jors.309)\n3. **BPX standard:** *Battery Parameter eXchange (BPX)*, an open standard for lithium-ion battery parameterisation. [GitHub](https://github.com/pybop-team/BPX)\n4. **Electrode tortuosity & anisotropy:** M. Ebner et al., *Tortuosity anisotropy in lithium-ion battery electrodes*, Advanced Energy Materials 4(5), 1301278 (2014). [doi:10.1002/aenm.201301278](https://doi.org/10.1002/aenm.201301278)\n5. **yt visualisation:** M. J. Turk et al., *yt: A multi-code analysis toolkit for astrophysical simulation data*, Astrophys. J. Suppl. 192, 9 (2011). [doi:10.1088/0067-0049/192/1/9](https://doi.org/10.1088/0067-0049/192/1/9)\n6. **Digital twin concept for batteries:** A. A. Franco et al., *Boosting rechargeable batteries R&D by multiscale modeling: myth or reality?*, Chemical Reviews 119(7), 4569\u20134627 (2019). [doi:10.1021/acs.chemrev.8b00239](https://doi.org/10.1021/acs.chemrev.8b00239)"
   }
  ],
  "metadata": {
diff --git a/tutorials/04_multiphase_and_fields.ipynb b/tutorials/04_multiphase_and_fields.ipynb
index 0e4c463e..2b41eb83 100644
--- a/tutorials/04_multiphase_and_fields.ipynb
+++ b/tutorials/04_multiphase_and_fields.ipynb
@@ -3,14 +3,14 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "<a href=\"https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/master/tutorials/04_multiphase_and_fields.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n# Tutorial 4 of 7: Multi-Phase Transport and Field Visualisation\n\n*OpenImpala Tutorial Series — From first solve to HPC deployment*\n\n---\n\nOpenImpala provides two approaches to computing transport properties:\n\n1. **Tortuosity solver** (`TortuosityHypre`) — Solves $\\nabla \\cdot (D\\,\\nabla\\varphi) = 0$ with Dirichlet boundary conditions and computes $\\tau$ from the resulting flux. This is what the high-level `oi.tortuosity()` function uses.\n2. **Effective diffusivity solver** (`EffectiveDiffusivityHypre`) — Solves the periodic cell problem $\\nabla \\cdot (D\\,\\nabla\\chi) = -\\nabla \\cdot (D\\,\\hat{e})$ to compute the effective diffusivity tensor via homogenisation.\n\nBoth give consistent results on binary structures, but the cell-problem approach generalises naturally to multi-phase composites with spatially varying $D(\\mathbf{x})$.\n\nThis tutorial covers both the effective diffusivity solver *and* multi-phase transport — the ability to assign different diffusion coefficients to different material phases, which is essential for real composite materials like battery electrodes with a carbon-binder domain (CBD).\n\n**What you will learn:**\n1. Use the `openimpala.core` module directly (lower-level than `oi.tortuosity()`).\n2. Solve the effective diffusivity cell problem on a binary structure.\n3. Visualise the corrector field using AMReX plotfiles and `yt`.\n4. Build and run a multi-phase transport problem with analytically known results.\n5. Construct a 3-phase battery electrode geometry and configure per-phase diffusivities.\n\n**Prerequisites:** [Tutorial 1](01_hello_openimpala.ipynb) for the high-level API. 
[Tutorial 2](02_digital_twin.ipynb) introduced `openimpala.core` and `yt` visualisation — we build on that here."
+   "source": "<a href=\"https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/master/tutorials/04_multiphase_and_fields.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n# Tutorial 4 of 7: Multi-Phase Transport and Field Visualisation\n\n*OpenImpala Tutorial Series \u2014 From first solve to HPC deployment*\n\n---\n\nOpenImpala provides two approaches to computing transport properties:\n\n1. **Tortuosity solver** (`TortuosityHypre`) \u2014 Solves $\\nabla \\cdot (D\\,\\nabla\\varphi) = 0$ with Dirichlet boundary conditions and computes $\\tau$ from the resulting flux. This is what the high-level `oi.tortuosity()` function uses.\n2. **Effective diffusivity solver** (`EffectiveDiffusivityHypre`) \u2014 Solves the periodic cell problem $\\nabla \\cdot (D\\,\\nabla\\chi) = -\\nabla \\cdot (D\\,\\hat{e})$ to compute the effective diffusivity tensor via homogenisation.\n\nBoth give consistent results on binary structures, but the cell-problem approach generalises naturally to multi-phase composites with spatially varying $D(\\mathbf{x})$.\n\nThis tutorial covers both the effective diffusivity solver *and* multi-phase transport \u2014 the ability to assign different diffusion coefficients to different material phases, which is essential for real composite materials like battery electrodes with a carbon-binder domain (CBD).\n\n**What you will learn:**\n1. Use the `openimpala.core` module directly (lower-level than `oi.tortuosity()`).\n2. Solve the effective diffusivity cell problem on a binary structure.\n3. Visualise the corrector field using AMReX plotfiles and `yt`.\n4. Build and run a multi-phase transport problem with analytically known results.\n5. Construct a 3-phase battery electrode geometry and configure per-phase diffusivities.\n\n**Prerequisites:** [Tutorial 1](01_hello_openimpala.ipynb) for the high-level API. 
[Tutorial 2](02_digital_twin.ipynb) introduced `openimpala.core` and `yt` visualisation \u2014 we build on that here."
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Install OpenImpala, PoreSpy, yt (for AMReX plotfile visualisation), and Matplotlib.\n!pip install -q openimpala porespy yt matplotlib"
+   "source": "# Install OpenImpala (compiled C++ backend needed for low-level API in this tutorial)\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest porespy yt matplotlib"
   },
   {
    "cell_type": "code",
@@ -34,7 +34,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 2. Solving with the Core API\n\nThe high-level `oi.tortuosity()` function wraps several steps (VoxelImage creation, percolation check, volume fraction, solver construction). Here we perform those steps explicitly using `openimpala.core`, which gives more control — for example, enabling plotfile output.\n\nWe solve the **effective diffusivity cell problem** in the Z-direction with `write_plotfile=True` so we can visualise the corrector field afterwards."
+   "source": "## 2. Solving with the Core API\n\nThe high-level `oi.tortuosity()` function wraps several steps (VoxelImage creation, percolation check, volume fraction, solver construction). Here we perform those steps explicitly using `openimpala.core`, which gives more control \u2014 for example, enabling plotfile output.\n\nWe solve the **effective diffusivity cell problem** in the Z-direction with `write_plotfile=True` so we can visualise the corrector field afterwards."
   },
   {
    "cell_type": "code",
@@ -46,30 +46,30 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 3. Visualising the Corrector Field\n\nThe effective diffusivity solver writes the corrector field $\\chi_z$ to an AMReX plotfile. We load it with [yt](https://yt-project.org/) and plot a slice. Regions where the field deviates strongly from zero indicate where the microstructure forces the diffusion flux to deviate from the imposed direction — this is the physical origin of tortuosity."
+   "source": "## 3. Visualising the Corrector Field\n\nThe effective diffusivity solver writes the corrector field $\\chi_z$ to an AMReX plotfile. We load it with [yt](https://yt-project.org/) and plot a slice. Regions where the field deviates strongly from zero indicate where the microstructure forces the diffusion flux to deviate from the imposed direction \u2014 this is the physical origin of tortuosity."
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Find the generated plotfile directory\nplotfile_dirs = [d for d in glob.glob(f\"{out_dir}/*\") if os.path.isdir(d)]\nif plotfile_dirs:\n    latest_plotfile = sorted(plotfile_dirs)[-1]\n    print(f\"Loading plotfile: {latest_plotfile}\")\n\n    yt.funcs.mylog.setLevel(40)  # Suppress verbose yt logging\n    ds = yt.load(latest_plotfile)\n\n    # List available fields to find the corrector field name\n    field_list = [f for f in ds.field_list if f[0] == \"boxlib\"]\n    print(f\"Available fields: {field_list}\")\n\n    # Plot a slice through the Y-midplane\n    field_name = field_list[0]  # Use the first available boxlib field\n    slc = yt.SlicePlot(ds, normal=\"y\", fields=field_name)\n    slc.set_log(field_name, False)\n    slc.set_cmap(field_name, \"RdBu_r\")\n    slc.annotate_title(f\"Corrector Field (Z-direction)\")\n    slc.show()\nelse:\n    print(\"No plotfile found — check that write_plotfile=True was set.\")"
+   "source": "# Find the generated plotfile directory\nplotfile_dirs = [d for d in glob.glob(f\"{out_dir}/*\") if os.path.isdir(d)]\nif plotfile_dirs:\n    latest_plotfile = sorted(plotfile_dirs)[-1]\n    print(f\"Loading plotfile: {latest_plotfile}\")\n\n    yt.funcs.mylog.setLevel(40)  # Suppress verbose yt logging\n    ds = yt.load(latest_plotfile)\n\n    # List available fields to find the corrector field name\n    field_list = [f for f in ds.field_list if f[0] == \"boxlib\"]\n    print(f\"Available fields: {field_list}\")\n\n    # Plot a slice through the Y-midplane\n    field_name = field_list[0]  # Use the first available boxlib field\n    slc = yt.SlicePlot(ds, normal=\"y\", fields=field_name)\n    slc.set_log(field_name, False)\n    slc.set_cmap(field_name, \"RdBu_r\")\n    slc.annotate_title(f\"Corrector Field (Z-direction)\")\n    slc.show()\nelse:\n    print(\"No plotfile found \u2014 check that write_plotfile=True was set.\")"
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 4. Multi-Phase Transport: Worked Example\n\nReal microstructures often contain more than two phases. A lithium-ion battery cathode, for example, has:\n- **Pore** ($D = D_\\text{bulk}$) — the electrolyte-filled void\n- **Carbon-binder domain (CBD)** ($D \\approx 0.1\\,D_\\text{bulk}$) — partially conducting\n- **Active material** ($D = 0$) — impermeable solid\n\nOpenImpala's multi-phase mode assigns a different diffusion coefficient to each phase via the `tortuosity.active_phases` and `tortuosity.phase_diffusivities` parameters. The HYPRE matrix fill then uses the harmonic mean of adjacent cell coefficients at each face:\n\n$$D_\\text{face} = \\frac{2\\,D_\\text{left}\\,D_\\text{right}}{D_\\text{left} + D_\\text{right}}$$\n\nThis is physically correct (series resistance across an interface) and ensures that a zero-diffusivity phase creates an impermeable boundary.\n\n### Analytical Benchmarks\n\nWe validate multi-phase transport using two classic composite geometries with known analytical solutions on a discrete $N$-cell grid:\n\n**Series layers** (layers perpendicular to flow) — the Reuss / harmonic bound:\n$$\\tau_\\text{series} = \\frac{(N-1)(D_0 + D_1)}{2N \\cdot D_0 \\cdot D_1}$$\n\n**Parallel layers** (layers parallel to flow) — the Voigt / arithmetic bound:\n$$\\tau_\\text{parallel} = \\frac{2(N-1)}{N(D_0 + D_1)}$$\n\nThese provide exact targets for validating the multi-phase solver."
+   "source": "## 4. Multi-Phase Transport: Worked Example\n\nReal microstructures often contain more than two phases. A lithium-ion battery cathode, for example, has:\n- **Pore** ($D = D_\\text{bulk}$) \u2014 the electrolyte-filled void\n- **Carbon-binder domain (CBD)** ($D \\approx 0.1\\,D_\\text{bulk}$) \u2014 partially conducting\n- **Active material** ($D = 0$) \u2014 impermeable solid\n\nOpenImpala's multi-phase mode assigns a different diffusion coefficient to each phase via the `tortuosity.active_phases` and `tortuosity.phase_diffusivities` parameters. The HYPRE matrix fill then uses the harmonic mean of adjacent cell coefficients at each face:\n\n$$D_\\text{face} = \\frac{2\\,D_\\text{left}\\,D_\\text{right}}{D_\\text{left} + D_\\text{right}}$$\n\nThis is physically correct (series resistance across an interface) and ensures that a zero-diffusivity phase creates an impermeable boundary.\n\n### Analytical Benchmarks\n\nWe validate multi-phase transport using two classic composite geometries with known analytical solutions on a discrete $N$-cell grid:\n\n**Series layers** (layers perpendicular to flow) \u2014 the Reuss / harmonic bound:\n$$\\tau_\\text{series} = \\frac{(N-1)(D_0 + D_1)}{2N \\cdot D_0 \\cdot D_1}$$\n\n**Parallel layers** (layers parallel to flow) \u2014 the Voigt / arithmetic bound:\n$$\\tau_\\text{parallel} = \\frac{2(N-1)}{N(D_0 + D_1)}$$\n\nThese provide exact targets for validating the multi-phase solver."
   },
   {
    "cell_type": "code",
-   "source": "# --- Build synthetic multi-phase structures ---\n# Alternating layers of phase 0 (D=1.0) and phase 1 (D=0.5)\n\nN_mp = 32\nD0, D1 = 1.0, 0.5\n\n# Series: layers perpendicular to flow (X-direction)\nseries = np.zeros((N_mp, N_mp, N_mp), dtype=np.int32)\nfor i in range(N_mp):\n    series[:, :, i] = i % 2  # alternating layers along X\n\n# Parallel: layers parallel to flow (Y-direction layering, X flow)\nparallel = np.zeros((N_mp, N_mp, N_mp), dtype=np.int32)\nfor j in range(N_mp):\n    parallel[:, j, :] = j % 2  # alternating layers along Y\n\n# Visualise both geometries\nfig, axes = plt.subplots(1, 2, figsize=(10, 4), dpi=120)\ncmap = plt.cm.colors.ListedColormap(['#4ECDC4', '#FF6B6B'])\n\nax = axes[0]\nax.imshow(series[N_mp//2, :, :], cmap=cmap, interpolation='nearest')\nax.set_title(\"Series layers\\n(perpendicular to X flow)\")\nax.set_xlabel(\"X (flow direction) →\")\nax.set_ylabel(\"Y\")\nax.set_aspect('equal')\n\nax = axes[1]\nax.imshow(parallel[N_mp//2, :, :], cmap=cmap, interpolation='nearest')\nax.set_title(\"Parallel layers\\n(parallel to X flow)\")\nax.set_xlabel(\"X (flow direction) →\")\nax.set_ylabel(\"Y\")\nax.set_aspect('equal')\n\n# Legend\nfrom matplotlib.patches import Patch\nlegend_elements = [Patch(facecolor='#4ECDC4', label=f'Phase 0 (D={D0})'),\n                   Patch(facecolor='#FF6B6B', label=f'Phase 1 (D={D1})')]\nfig.legend(handles=legend_elements, loc='lower center', ncol=2, fontsize=10,\n           bbox_to_anchor=(0.5, -0.02))\nplt.tight_layout()\nplt.show()\n\nprint(f\"Grid size: {N_mp}^3\")\nprint(f\"Phase diffusivities: D0={D0}, D1={D1}\")",
+   "source": "# --- Build synthetic multi-phase structures ---\n# Alternating layers of phase 0 (D=1.0) and phase 1 (D=0.5)\n\nN_mp = 32\nD0, D1 = 1.0, 0.5\n\n# Series: layers perpendicular to flow (X-direction)\nseries = np.zeros((N_mp, N_mp, N_mp), dtype=np.int32)\nfor i in range(N_mp):\n    series[:, :, i] = i % 2  # alternating layers along X\n\n# Parallel: layers parallel to flow (Y-direction layering, X flow)\nparallel = np.zeros((N_mp, N_mp, N_mp), dtype=np.int32)\nfor j in range(N_mp):\n    parallel[:, j, :] = j % 2  # alternating layers along Y\n\n# Visualise both geometries\nfig, axes = plt.subplots(1, 2, figsize=(10, 4), dpi=120)\ncmap = plt.cm.colors.ListedColormap(['#4ECDC4', '#FF6B6B'])\n\nax = axes[0]\nax.imshow(series[N_mp//2, :, :], cmap=cmap, interpolation='nearest')\nax.set_title(\"Series layers\\n(perpendicular to X flow)\")\nax.set_xlabel(\"X (flow direction) \u2192\")\nax.set_ylabel(\"Y\")\nax.set_aspect('equal')\n\nax = axes[1]\nax.imshow(parallel[N_mp//2, :, :], cmap=cmap, interpolation='nearest')\nax.set_title(\"Parallel layers\\n(parallel to X flow)\")\nax.set_xlabel(\"X (flow direction) \u2192\")\nax.set_ylabel(\"Y\")\nax.set_aspect('equal')\n\n# Legend\nfrom matplotlib.patches import Patch\nlegend_elements = [Patch(facecolor='#4ECDC4', label=f'Phase 0 (D={D0})'),\n                   Patch(facecolor='#FF6B6B', label=f'Phase 1 (D={D1})')]\nfig.legend(handles=legend_elements, loc='lower center', ncol=2, fontsize=10,\n           bbox_to_anchor=(0.5, -0.02))\nplt.tight_layout()\nplt.show()\n\nprint(f\"Grid size: {N_mp}^3\")\nprint(f\"Phase diffusivities: D0={D0}, D1={D1}\")",
    "metadata": {},
    "execution_count": null,
    "outputs": []
   },
   {
    "cell_type": "markdown",
-   "source": "### Running the Multi-Phase Solver\n\nMulti-phase transport is configured via AMReX's `ParmParse` parameter database. When using the C++ executable (or the `openimpala.core` API), you specify which phases are active and their diffusion coefficients in an **inputs file**:\n\n```\ntortuosity.active_phases = 0 1\ntortuosity.phase_diffusivities = 1.0 0.5\n```\n\nBelow we generate inputs files for both the series and parallel geometries, write the phase data as a RAW binary file, and run the solver. This is the same workflow you would use on an HPC cluster — the only difference is the `mpirun` prefix.",
+   "source": "### Running the Multi-Phase Solver\n\nMulti-phase transport is configured via AMReX's `ParmParse` parameter database. When using the C++ executable (or the `openimpala.core` API), you specify which phases are active and their diffusion coefficients in an **inputs file**:\n\n```\ntortuosity.active_phases = 0 1\ntortuosity.phase_diffusivities = 1.0 0.5\n```\n\nBelow we generate inputs files for both the series and parallel geometries, write the phase data as a RAW binary file, and run the solver. This is the same workflow you would use on an HPC cluster \u2014 the only difference is the `mpirun` prefix.",
    "metadata": {}
   },
   {
@@ -86,14 +86,14 @@
   },
   {
    "cell_type": "code",
-   "source": "# --- Visualise the bounds as a function of D1/D0 ratio ---\nratios = np.linspace(0.01, 1.0, 200)\ntau_uniform = (N_mp - 1) / N_mp\n\ntau_s = (N_mp - 1) * (1.0 + ratios) / (2 * N_mp * 1.0 * ratios)\ntau_p = 2 * (N_mp - 1) / (N_mp * (1.0 + ratios))\n\nfig, ax = plt.subplots(figsize=(8, 5), dpi=120)\nax.fill_between(ratios, tau_p, tau_s, alpha=0.15, color='#1f77b4',\n                label='Feasible region')\nax.plot(ratios, tau_s, '-', color='#d62728', lw=2, label='Series (Reuss bound)')\nax.plot(ratios, tau_p, '-', color='#2ca02c', lw=2, label='Parallel (Voigt bound)')\nax.axhline(tau_uniform, color='gray', ls=':', lw=1, label=f'Uniform (tau={(N_mp-1)/N_mp:.4f})')\n\n# Mark the specific case D1/D0 = 0.5\nax.plot(D1/D0, tau_series_exact, 'o', color='#d62728', ms=10, zorder=5)\nax.plot(D1/D0, tau_parallel_exact, 'o', color='#2ca02c', ms=10, zorder=5)\nax.annotate(f'Series: tau={tau_series_exact:.4f}',\n            xy=(D1/D0, tau_series_exact), xytext=(0.3, tau_series_exact + 0.15),\n            arrowprops=dict(arrowstyle='->', color='#d62728'), fontsize=9, color='#d62728')\nax.annotate(f'Parallel: tau={tau_parallel_exact:.4f}',\n            xy=(D1/D0, tau_parallel_exact), xytext=(0.3, tau_parallel_exact - 0.12),\n            arrowprops=dict(arrowstyle='->', color='#2ca02c'), fontsize=9, color='#2ca02c')\n\nax.set_xlabel(r'$D_1 / D_0$', fontsize=12)\nax.set_ylabel(r'Tortuosity $\\tau$', fontsize=12)\nax.set_title(f'Reuss–Voigt Bounds for Two-Phase Composite (N={N_mp})', fontweight='bold')\nax.legend(loc='upper right', fontsize=9)\nax.set_xlim(0, 1.05)\nax.set_ylim(0, max(tau_s) * 1.1)\nax.grid(True, alpha=0.3)\nax.spines['top'].set_visible(False)\nax.spines['right'].set_visible(False)\nplt.tight_layout()\nplt.show()\n\nprint(f\"\\nFor D1/D0 = {D1/D0}:\")\nprint(f\"  Series bound:   tau = {tau_series_exact:.6f}\")\nprint(f\"  Parallel bound: tau = {tau_parallel_exact:.6f}\")\nprint(f\"  Ratio (series/parallel): 
{tau_series_exact/tau_parallel_exact:.2f}x\")",
+   "source": "# --- Visualise the bounds as a function of D1/D0 ratio ---\nratios = np.linspace(0.01, 1.0, 200)\ntau_uniform = (N_mp - 1) / N_mp\n\ntau_s = (N_mp - 1) * (1.0 + ratios) / (2 * N_mp * 1.0 * ratios)\ntau_p = 2 * (N_mp - 1) / (N_mp * (1.0 + ratios))\n\nfig, ax = plt.subplots(figsize=(8, 5), dpi=120)\nax.fill_between(ratios, tau_p, tau_s, alpha=0.15, color='#1f77b4',\n                label='Feasible region')\nax.plot(ratios, tau_s, '-', color='#d62728', lw=2, label='Series (Reuss bound)')\nax.plot(ratios, tau_p, '-', color='#2ca02c', lw=2, label='Parallel (Voigt bound)')\nax.axhline(tau_uniform, color='gray', ls=':', lw=1, label=f'Uniform (tau={(N_mp-1)/N_mp:.4f})')\n\n# Mark the specific case D1/D0 = 0.5\nax.plot(D1/D0, tau_series_exact, 'o', color='#d62728', ms=10, zorder=5)\nax.plot(D1/D0, tau_parallel_exact, 'o', color='#2ca02c', ms=10, zorder=5)\nax.annotate(f'Series: tau={tau_series_exact:.4f}',\n            xy=(D1/D0, tau_series_exact), xytext=(0.3, tau_series_exact + 0.15),\n            arrowprops=dict(arrowstyle='->', color='#d62728'), fontsize=9, color='#d62728')\nax.annotate(f'Parallel: tau={tau_parallel_exact:.4f}',\n            xy=(D1/D0, tau_parallel_exact), xytext=(0.3, tau_parallel_exact - 0.12),\n            arrowprops=dict(arrowstyle='->', color='#2ca02c'), fontsize=9, color='#2ca02c')\n\nax.set_xlabel(r'$D_1 / D_0$', fontsize=12)\nax.set_ylabel(r'Tortuosity $\\tau$', fontsize=12)\nax.set_title(f'Reuss\u2013Voigt Bounds for Two-Phase Composite (N={N_mp})', fontweight='bold')\nax.legend(loc='upper right', fontsize=9)\nax.set_xlim(0, 1.05)\nax.set_ylim(0, max(tau_s) * 1.1)\nax.grid(True, alpha=0.3)\nax.spines['top'].set_visible(False)\nax.spines['right'].set_visible(False)\nplt.tight_layout()\nplt.show()\n\nprint(f\"\\nFor D1/D0 = {D1/D0}:\")\nprint(f\"  Series bound:   tau = {tau_series_exact:.6f}\")\nprint(f\"  Parallel bound: tau = {tau_parallel_exact:.6f}\")\nprint(f\"  Ratio (series/parallel): 
{tau_series_exact/tau_parallel_exact:.2f}x\")",
    "metadata": {},
    "execution_count": null,
    "outputs": []
   },
   {
    "cell_type": "markdown",
-   "source": "### Extension: Three-Phase Battery Electrode\n\nA realistic battery electrode has three phases. Below we construct a synthetic 3-phase microstructure and write the corresponding inputs file. The key insight is that phases with $D = 0$ are automatically excluded from the linear system — their matrix rows become $A_{ii} = 1$, $\\text{rhs}_i = 0$, decoupling them completely.\n\nThis means you can model:\n- **Pore** (phase 0): $D = 1.0$ — full electrolyte diffusivity\n- **CBD** (phase 1): $D = 0.1$ — reduced diffusivity through carbon-binder\n- **Active material** (phase 2): $D = 0$ — impermeable solid particle",
+   "source": "### Extension: Three-Phase Battery Electrode\n\nA realistic battery electrode has three phases. Below we construct a synthetic 3-phase microstructure and write the corresponding inputs file. The key insight is that phases with $D = 0$ are automatically excluded from the linear system \u2014 their matrix rows become $A_{ii} = 1$, $\\text{rhs}_i = 0$, decoupling them completely.\n\nThis means you can model:\n- **Pore** (phase 0): $D = 1.0$ \u2014 full electrolyte diffusivity\n- **CBD** (phase 1): $D = 0.1$ \u2014 reduced diffusivity through carbon-binder\n- **Active material** (phase 2): $D = 0$ \u2014 impermeable solid particle",
    "metadata": {}
   },
   {
@@ -105,7 +105,7 @@
   },
   {
    "cell_type": "markdown",
-   "source": "## Next Steps\n\nThis tutorial demonstrated the effective diffusivity cell-problem solver, field visualisation, and multi-phase transport with analytically validated benchmarks. The key points:\n\n- **Effective diffusivity** via homogenisation gives the full $D_\\text{eff}$ tensor, not just scalar tortuosity.\n- **Multi-phase transport** assigns per-phase diffusion coefficients via `tortuosity.active_phases` and `tortuosity.phase_diffusivities` in the inputs file.\n- **Analytical validation** against Reuss (series) and Voigt (parallel) bounds confirms the harmonic mean face coefficient implementation.\n- **Three-phase structures** (pore/CBD/solid) are handled naturally — phases with $D=0$ are decoupled automatically.\n\nFor the test inputs used in OpenImpala's CI/CD regression suite, see `tests/inputs/tMultiPhaseTransport*.inputs`.\n\n**Continue the series:**\n- [Tutorial 5: Surrogate Modelling](05_surrogate_modelling.ipynb) — Generate labelled datasets for machine learning.\n- [Tutorial 6: Topology Optimisation](06_topology_optimisation.ipynb) — Use OpenImpala as a cost-function evaluator in an optimisation loop.\n- [Tutorial 7: Scaling to HPC](07_hpc_scaling.ipynb) — Run on larger datasets with MPI.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **Homogenisation theory:** S. Torquato, *Random Heterogeneous Materials: Microstructure and Macroscopic Properties*, Springer (2002). Chapter 17 covers the cell problem for effective conductivity.\n3. **Reuss and Voigt bounds:** W. Voigt, *Lehrbuch der Kristallphysik*, Teubner (1928); A. Reuss, *Berechnung der Fliessgrenze von Mischkristallen*, Z. Angew. Math. Mech. 9, 49–58 (1929). 
The harmonic (Reuss) and arithmetic (Voigt) means give the tightest possible bounds without geometric information.\n4. **Effective diffusivity in electrodes:** L. Holzer et al., *Microstructure–property relationships in a gas diffusion layer (GDL) for Polymer Electrolyte Fuel Cells, Part I: effect of compression and anisotropy of dry GDL*, Electrochimica Acta 227, 419–434 (2017). [doi:10.1016/j.electacta.2017.01.030](https://doi.org/10.1016/j.electacta.2017.01.030)\n5. **Carbon-binder domain modelling:** F. L. E. Usseglio-Viretta et al., *Resolving the discrepancy in tortuosity factor estimation for Li-ion battery electrodes through micro-macro modeling and experiment*, J. Electrochem. Soc. 165(14), A3403 (2018). [doi:10.1149/2.0731814jes](https://doi.org/10.1149/2.0731814jes)\n6. **yt for AMReX data:** M. J. Turk et al., *yt: A multi-code analysis toolkit for astrophysical simulation data*, Astrophys. J. Suppl. 192, 9 (2011). [doi:10.1088/0067-0049/192/1/9](https://doi.org/10.1088/0067-0049/192/1/9)",
+   "source": "## Next Steps\n\nThis tutorial demonstrated the effective diffusivity cell-problem solver, field visualisation, and multi-phase transport with analytically validated benchmarks. The key points:\n\n- **Effective diffusivity** via homogenisation gives the full $D_\\text{eff}$ tensor, not just scalar tortuosity.\n- **Multi-phase transport** assigns per-phase diffusion coefficients via `tortuosity.active_phases` and `tortuosity.phase_diffusivities` in the inputs file.\n- **Analytical validation** against Reuss (series) and Voigt (parallel) bounds confirms the harmonic mean face coefficient implementation.\n- **Three-phase structures** (pore/CBD/solid) are handled naturally \u2014 phases with $D=0$ are decoupled automatically.\n\nFor the test inputs used in OpenImpala's CI/CD regression suite, see `tests/inputs/tMultiPhaseTransport*.inputs`.\n\n**Continue the series:**\n- [Tutorial 5: Surrogate Modelling](05_surrogate_modelling.ipynb) \u2014 Generate labelled datasets for machine learning.\n- [Tutorial 6: Topology Optimisation](06_topology_optimisation.ipynb) \u2014 Use OpenImpala as a cost-function evaluator in an optimisation loop.\n- [Tutorial 7: Scaling to HPC](07_hpc_scaling.ipynb) \u2014 Run on larger datasets with MPI.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **Homogenisation theory:** S. Torquato, *Random Heterogeneous Materials: Microstructure and Macroscopic Properties*, Springer (2002). Chapter 17 covers the cell problem for effective conductivity.\n3. **Reuss and Voigt bounds:** W. Voigt, *Lehrbuch der Kristallphysik*, Teubner (1928); A. Reuss, *Berechnung der Fliessgrenze von Mischkristallen*, Z. Angew. Math. Mech. 9, 49\u201358 (1929). 
The harmonic (Reuss) and arithmetic (Voigt) means give the tightest possible bounds without geometric information.\n4. **Effective diffusivity in electrodes:** L. Holzer et al., *Microstructure\u2013property relationships in a gas diffusion layer (GDL) for Polymer Electrolyte Fuel Cells, Part I: effect of compression and anisotropy of dry GDL*, Electrochimica Acta 227, 419\u2013434 (2017). [doi:10.1016/j.electacta.2017.01.030](https://doi.org/10.1016/j.electacta.2017.01.030)\n5. **Carbon-binder domain modelling:** F. L. E. Usseglio-Viretta et al., *Resolving the discrepancy in tortuosity factor estimation for Li-ion battery electrodes through micro-macro modeling and experiment*, J. Electrochem. Soc. 165(14), A3403 (2018). [doi:10.1149/2.0731814jes](https://doi.org/10.1149/2.0731814jes)\n6. **yt for AMReX data:** M. J. Turk et al., *yt: A multi-code analysis toolkit for astrophysical simulation data*, Astrophys. J. Suppl. 192, 9 (2011). [doi:10.1088/0067-0049/192/1/9](https://doi.org/10.1088/0067-0049/192/1/9)",
    "metadata": {}
   }
  ],
diff --git a/tutorials/07_hpc_scaling.ipynb b/tutorials/07_hpc_scaling.ipynb
index 7c3cb77f..608d2256 100644
--- a/tutorials/07_hpc_scaling.ipynb
+++ b/tutorials/07_hpc_scaling.ipynb
@@ -3,7 +3,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "<a href=\"https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/master/tutorials/07_hpc_scaling.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n# Tutorial 7 of 7: Scaling from Laptop to HPC\n\n*OpenImpala Tutorial Series — From first solve to HPC deployment*\n\n---\n\nThe previous tutorials all ran on a single machine. Real synchrotron datasets, however, can be $2000^3$ voxels or larger — too big to fit in a single node's memory, let alone solve interactively.\n\nOpenImpala is built on AMReX and MPI, so the same Python script you wrote in earlier tutorials can be launched with `mpirun` across many nodes. For very large files, OpenImpala's C++ readers can also bypass Python entirely, reading TIFF and HDF5 data in parallel directly from disk.\n\nThis tutorial combines **reference material** (MPI scripts, SLURM templates, reader tables) with **executable benchmarks** you can run right now — a problem-size scaling study and a solver comparison that produces real timing data.\n\n**What you will learn:**\n1. How OpenImpala distributes work across MPI ranks.\n2. The role of `max_grid_size` in domain decomposition.\n3. A complete MPI-ready Python script.\n4. How to use the C++ parallel readers for out-of-core I/O.\n5. A ready-to-use SLURM batch submission script.\n6. How solve time scales with problem size (executable benchmark).\n7. Which HYPRE solver to choose for your problem (executable comparison).\n\n**Prerequisites:** Familiarity with the OpenImpala Python API ([Tutorials 1-2](01_hello_openimpala.ipynb)) and access to a multi-core machine or HPC cluster. All earlier tutorials in the series can inform what scripts you deploy at scale."
+   "source": "<a href=\"https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/master/tutorials/07_hpc_scaling.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n# Tutorial 7 of 7: Scaling from Laptop to HPC\n\n*OpenImpala Tutorial Series \u2014 From first solve to HPC deployment*\n\n---\n\nThe previous tutorials all ran on a single machine. Real synchrotron datasets, however, can be $2000^3$ voxels or larger \u2014 too big to fit in a single node's memory, let alone solve interactively.\n\nOpenImpala is built on AMReX and MPI, so the same Python script you wrote in earlier tutorials can be launched with `mpirun` across many nodes. For very large files, OpenImpala's C++ readers can also bypass Python entirely, reading TIFF and HDF5 data in parallel directly from disk.\n\nThis tutorial combines **reference material** (MPI scripts, SLURM templates, reader tables) with **executable benchmarks** you can run right now \u2014 a problem-size scaling study and a solver comparison that produces real timing data.\n\n**What you will learn:**\n1. How OpenImpala distributes work across MPI ranks.\n2. The role of `max_grid_size` in domain decomposition.\n3. A complete MPI-ready Python script.\n4. How to use the C++ parallel readers for out-of-core I/O.\n5. A ready-to-use SLURM batch submission script.\n6. How solve time scales with problem size (executable benchmark).\n7. Which HYPRE solver to choose for your problem (executable comparison).\n\n**Prerequisites:** Familiarity with the OpenImpala Python API ([Tutorials 1-2](01_hello_openimpala.ipynb)) and access to a multi-core machine or HPC cluster. All earlier tutorials in the series can inform what scripts you deploy at scale."
   },
   {
    "cell_type": "markdown",
@@ -235,14 +235,14 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 6. Executable Scaling Study: Problem Size vs. Wall Time\n\nEven on a single machine, we can measure how solve time scales with problem size. This is a **weak-scaling proxy** — we increase the domain while keeping one rank, which shows the computational complexity of the solver. On a real cluster with proportionally more MPI ranks, wall time would stay roughly constant (ideal weak scaling).\n\nThis cell actually runs and produces real timing data."
+   "source": "## 6. Executable Scaling Study: Problem Size vs. Wall Time\n\nEven on a single machine, we can measure how solve time scales with problem size. This is a **weak-scaling proxy** \u2014 we increase the domain while keeping one rank, which shows the computational complexity of the solver. On a real cluster with proportionally more MPI ranks, wall time would stay roughly constant (ideal weak scaling).\n\nThis cell actually runs and produces real timing data."
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "!pip install -q openimpala porespy matplotlib"
+   "source": "# This tutorial demonstrates HPC features that require the compiled C++ backend\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest porespy matplotlib"
   },
   {
    "cell_type": "code",
@@ -266,7 +266,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## Summary\n\n| Scenario | Approach |\n|----------|----------|\n| **Laptop / Colab** | `pip install openimpala`, use NumPy arrays directly |\n| **Small cluster (1-16 cores)** | `mpirun -np 16 python script.py` with NumPy loading |\n| **Large cluster (16-128+ cores)** | `mpirun -np 128 python script.py` with `oi.read_image()` for parallel I/O |\n| **HPC without Python** | `mpirun -np 128 Diffusion ./inputs` (pure C++ application) |\n\n### Solver Quick Reference\n\n| Solver | Best For | Notes |\n|--------|----------|-------|\n| **FlexGMRES** | General use | Robust default, handles non-symmetric systems |\n| **PCG** | Symmetric problems | Fastest when applicable |\n| **SMG/PFMG** | Structured grids | Geometric multigrid, excellent on regular domains |\n| **BiCGSTAB** | Non-symmetric | Alternative to GMRES |\n\nThe API is the same at every scale — only the launch command changes. The scaling study and solver comparison in Sections 6-7 provide concrete data to guide your deployment decisions.\n\n**Back to the beginning:**\n- [Tutorial 1: Getting Started](01_hello_openimpala.ipynb) — Core workflow with synthetic microstructures.\n- [Tutorial 2: From 3D Image to Device Model](02_digital_twin.ipynb) — Load real CT scans and export to PyBaMM.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **AMReX:** W. Zhang et al., *AMReX: a framework for block-structured adaptive mesh refinement*, J. Open Source Software 4(37), 1370 (2019). [doi:10.21105/joss.01370](https://doi.org/10.21105/joss.01370)\n3. **AMReX scaling:** A. S. Almgren et al., *Block-structured adaptive mesh refinement — theory, implementation and application*, J. Comput. Physics 332, 1-28 (2017). [doi:10.1016/j.jcp.2016.12.073](https://doi.org/10.1016/j.jcp.2016.12.073)\n4. **HYPRE:** R. 
D. Falgout & U. M. Yang, *hypre: A library of high performance preconditioners*, Computational Science — ICCS 2002, LNCS 2331, pp. 632-641 (2002). [doi:10.1007/3-540-47789-6_66](https://doi.org/10.1007/3-540-47789-6_66)\n5. **Parallel HDF5:** The HDF Group, *HDF5 — Parallel I/O*, [hdfgroup.org](https://www.hdfgroup.org/solutions/hdf5/)\n6. **Apptainer/Singularity for HPC:** G. M. Kurtzer et al., *Singularity: Scientific containers for mobility of compute*, PLoS ONE 12(5), e0177459 (2017). [doi:10.1371/journal.pone.0177459](https://doi.org/10.1371/journal.pone.0177459)\n7. **MPI standard:** Message Passing Interface Forum, *MPI: A Message-Passing Interface Standard, Version 4.0* (2021). [mpi-forum.org](https://www.mpi-forum.org/docs/mpi-4.0/mpi40-report.pdf)"
+   "source": "## Summary\n\n| Scenario | Approach |\n|----------|----------|\n| **Laptop / Colab** | `pip install openimpala`, use NumPy arrays directly |\n| **Small cluster (1-16 cores)** | `mpirun -np 16 python script.py` with NumPy loading |\n| **Large cluster (16-128+ cores)** | `mpirun -np 128 python script.py` with `oi.read_image()` for parallel I/O |\n| **HPC without Python** | `mpirun -np 128 Diffusion ./inputs` (pure C++ application) |\n\n### Solver Quick Reference\n\n| Solver | Best For | Notes |\n|--------|----------|-------|\n| **FlexGMRES** | General use | Robust default, handles non-symmetric systems |\n| **PCG** | Symmetric problems | Fastest when applicable |\n| **SMG/PFMG** | Structured grids | Geometric multigrid, excellent on regular domains |\n| **BiCGSTAB** | Non-symmetric | Alternative to GMRES |\n\nThe API is the same at every scale \u2014 only the launch command changes. The scaling study and solver comparison in Sections 6-7 provide concrete data to guide your deployment decisions.\n\n**Back to the beginning:**\n- [Tutorial 1: Getting Started](01_hello_openimpala.ipynb) \u2014 Core workflow with synthetic microstructures.\n- [Tutorial 2: From 3D Image to Device Model](02_digital_twin.ipynb) \u2014 Load real CT scans and export to PyBaMM.\n\n---\n\n## References & Further Reading\n\n1. **OpenImpala:** S. Mayner, F. Ciucci, *OpenImpala: open-source computational framework for microstructural analysis of 3D tomography data*, SoftwareX (2024). [GitHub](https://github.com/BASE-Laboratory/OpenImpala)\n2. **AMReX:** W. Zhang et al., *AMReX: a framework for block-structured adaptive mesh refinement*, J. Open Source Software 4(37), 1370 (2019). [doi:10.21105/joss.01370](https://doi.org/10.21105/joss.01370)\n3. **AMReX scaling:** A. S. Almgren et al., *Block-structured adaptive mesh refinement \u2014 theory, implementation and application*, J. Comput. Physics 332, 1-28 (2017). 
[doi:10.1016/j.jcp.2016.12.073](https://doi.org/10.1016/j.jcp.2016.12.073)\n4. **HYPRE:** R. D. Falgout & U. M. Yang, *hypre: A library of high performance preconditioners*, Computational Science \u2014 ICCS 2002, LNCS 2331, pp. 632-641 (2002). [doi:10.1007/3-540-47789-6_66](https://doi.org/10.1007/3-540-47789-6_66)\n5. **Parallel HDF5:** The HDF Group, *HDF5 \u2014 Parallel I/O*, [hdfgroup.org](https://www.hdfgroup.org/solutions/hdf5/)\n6. **Apptainer/Singularity for HPC:** G. M. Kurtzer et al., *Singularity: Scientific containers for mobility of compute*, PLoS ONE 12(5), e0177459 (2017). [doi:10.1371/journal.pone.0177459](https://doi.org/10.1371/journal.pone.0177459)\n7. **MPI standard:** Message Passing Interface Forum, *MPI: A Message-Passing Interface Standard, Version 4.0* (2021). [mpi-forum.org](https://www.mpi-forum.org/docs/mpi-4.0/mpi40-report.pdf)"
   }
  ],
  "metadata": {

From ac22502e5e17050539b1b2f47d681005af0e64f7 Mon Sep 17 00:00:00 2001
From: James Le Houx <james.lehoux@gre.ac.uk>
Date: Sat, 4 Apr 2026 12:05:13 +0000
Subject: [PATCH 3/3] Add CUDA runtime packages back to tutorials needing
 openimpala-cuda

The GPU wheel does not bundle the CUDA runtime shared libraries
(libcudart, libcublas, etc.), so on Colab they must be installed
separately with pip. Add nvidia-cuda-runtime-cu12, nvidia-cublas-cu12,
nvidia-cusparse-cu12 and nvidia-curand-cu12 to the install cells of
tutorials 02, 04 and 07.
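
A quick way to confirm the install cell pulled in the runtime packages
is sketched below. This is an illustration only: the helper name
missing_packages is hypothetical and not part of OpenImpala, and the
check looks at installed distribution metadata, not at whether the
shared libraries actually load.

```python
from importlib import metadata

# Package names taken from the install cells changed in this patch.
CUDA_RUNTIME_PACKAGES = [
    "nvidia-cuda-runtime-cu12",
    "nvidia-cublas-cu12",
    "nvidia-cusparse-cu12",
    "nvidia-curand-cu12",
]


def missing_packages(names):
    """Return the distributions in `names` that pip has not installed.

    Hypothetical helper for illustration; it only inspects package
    metadata and does not try to load any CUDA library.
    """
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(CUDA_RUNTIME_PACKAGES)
    if missing:
        print("Missing CUDA runtime packages:", ", ".join(missing))
    else:
        print("All CUDA runtime packages are installed.")
```

Run in a notebook cell after the install cell, this reports any of the
four packages that are absent from the environment.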

https://claude.ai/code/session_01RKnn97qiD7sbCeABHH3eQk
---
 tutorials/02_digital_twin.ipynb          | 2 +-
 tutorials/04_multiphase_and_fields.ipynb | 2 +-
 tutorials/07_hpc_scaling.ipynb           | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tutorials/02_digital_twin.ipynb b/tutorials/02_digital_twin.ipynb
index d111938c..6527ca08 100644
--- a/tutorials/02_digital_twin.ipynb
+++ b/tutorials/02_digital_twin.ipynb
@@ -10,7 +10,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Install OpenImpala (compiled C++ backend needed for low-level API in this tutorial)\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest pybamm bpx tifffile matplotlib yt"
+   "source": "# Install OpenImpala (compiled C++ backend needed for low-level API in this tutorial)\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest nvidia-cuda-runtime-cu12 nvidia-cublas-cu12 nvidia-cusparse-cu12 nvidia-curand-cu12 pybamm bpx tifffile matplotlib yt"
   },
   {
    "cell_type": "code",
diff --git a/tutorials/04_multiphase_and_fields.ipynb b/tutorials/04_multiphase_and_fields.ipynb
index 2b41eb83..1600ba46 100644
--- a/tutorials/04_multiphase_and_fields.ipynb
+++ b/tutorials/04_multiphase_and_fields.ipynb
@@ -10,7 +10,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Install OpenImpala (compiled C++ backend needed for low-level API in this tutorial)\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest porespy yt matplotlib"
+   "source": "# Install OpenImpala (compiled C++ backend needed for low-level API in this tutorial)\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest nvidia-cuda-runtime-cu12 nvidia-cublas-cu12 nvidia-cusparse-cu12 nvidia-curand-cu12 porespy yt matplotlib"
   },
   {
    "cell_type": "code",
diff --git a/tutorials/07_hpc_scaling.ipynb b/tutorials/07_hpc_scaling.ipynb
index 608d2256..7f2beeb9 100644
--- a/tutorials/07_hpc_scaling.ipynb
+++ b/tutorials/07_hpc_scaling.ipynb
@@ -242,7 +242,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# This tutorial demonstrates HPC features that require the compiled C++ backend\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest porespy matplotlib"
+   "source": "# This tutorial demonstrates HPC features that require the compiled C++ backend\n!pip install -q openimpala-cuda --find-links https://github.com/BASE-Laboratory/OpenImpala/releases/latest nvidia-cuda-runtime-cu12 nvidia-cublas-cu12 nvidia-cusparse-cu12 nvidia-curand-cu12 porespy matplotlib"
   },
   {
    "cell_type": "code",