From be593adc18081af74574b708d8d53aba9a27b012 Mon Sep 17 00:00:00 2001 From: James Le Houx <37665786+jameslehoux@users.noreply.github.com> Date: Sun, 29 Mar 2026 15:36:38 +0100 Subject: [PATCH 01/24] Create initial draft for OpenImpala paper Add initial draft of OpenImpala paper detailing its framework and advancements. --- paper.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 paper.md diff --git a/paper.md b/paper.md new file mode 100644 index 00000000..1aa428e8 --- /dev/null +++ b/paper.md @@ -0,0 +1,64 @@ +--- +title: 'OpenImpala: A High-Performance, Image-Based Microstructural Parameterisation Engine for Porous Media' +tags: + - C++ + - Python + - high-performance computing + - porous media + - battery modelling + - AMReX + - tortuosity +authors: + - name: James Le Houx + orcid: 0000-0002-1576-0673 + corresponding: true + email: james.le-houx@stfc.ac.uk + affiliation: "1, 2, 3" + - name: Denis Kramer + orcid: 0000-0003-0605-1047 + affiliation: 4 +affiliations: + - name: University of Greenwich, Old Royal Naval College, Park Row, London, SE10 9LS, United Kingdom + index: 1 + - name: ISIS Neutron & Muon Source, Rutherford Appleton Laboratory, Didcot, OX11 0QX, United Kingdom + index: 2 + - name: The Faraday Institution, Harwell Science and Innovation Campus, Didcot, OX11 0RA, United Kingdom + index: 3 + - name: Helmut Schmidt University, Hamburg, Germany + index: 4 +date: 29 March 2026 +bibliography: paper.bib +--- +# Summary + +`OpenImpala` is a high-performance computing (HPC) framework built upon the AMReX library, designed for the direct evaluation of effective transport properties—such as tortuosity and effective electrical conductivity—from three-dimensional microstructural imaging data (e.g., X-ray Computed Tomography). By formulating the discrete governing equations directly on Cartesian voxel grids, `OpenImpala` circumvents the computationally prohibitive mesh-generation bottlenecks typically associated with conventional finite element methods (FEM). The framework couples a highly scalable, distributed-memory C++ computational backend with a flexible Python interface, facilitating the automated parameterisation of complex porous media for downstream systems-level continuum modelling. + +# Statement of Need + +High-resolution three-dimensional microstructural imaging is increasingly utilized across battery research, geosciences, and materials engineering to characterize internal transport phenomena. However, the extraction of bulk effective physical parameters from billion-voxel datasets represents a significant computational challenge. Conventional finite element tools often encounter tractability limits during the mesh-generation phase when applied to complex, highly tortuous microstructures. Conversely, existing open-source voxel-based solvers, while highly accessible, generally lack the distributed-memory Message Passing Interface (MPI) scalability required to process massive, out-of-core datasets on HPC architectures. + +`OpenImpala` addresses this methodological gap by providing a natively parallelised (MPI, OpenMP, and CUDA) matrix-free AMReX backend capable of scaling across thousands of compute cores. Through its modern Python application programming interface (API), `OpenImpala` serves as a robust upstream microstructural parameterisation engine. It is capable of ingesting raw or segmented tomographic data and seamlessly exporting computed effective macroscopic properties to downstream continuum models, such as the widely adopted PyBaMM battery modelling framework. + +# Major Advancements Since Initial Publication + +`OpenImpala` was originally introduced as a command-line hybrid C++ and Fortran framework optimized for solving Laplace's equation to extract tortuosity. Since its initial publication in *SoftwareX*, the framework has undergone a complete architectural transformation. This overhaul was motivated by three primary objectives: achieving exascale readiness on heterogeneous hardware, enabling seamless interoperability with the broader scientific Python ecosystem, and ensuring long-term software sustainability. + +To address the computational demands of increasingly massive tomographic datasets, the core architecture was heavily modernised. The legacy Fortran compute kernels were entirely deprecated in favor of a pure C++ infrastructure utilizing native AMReX Lambdas. This shift facilitated the implementation of native GPU acceleration via CUDA, allowing the solver to fully leverage modern heterogeneous supercomputing architectures. Furthermore, to optimize memory utilization during massive out-of-core computations, the monolithic diffusion driver was deconstructed. The framework now utilizes intelligent solver routing driven by AMReX’s Matrix-Free Multi-Level Multi-Grid (MLMG) infrastructure, all managed by a modernised, modular CMake build system. + +Beyond raw computational performance, the framework was redesigned to function as an accessible, upstream parameterisation engine for modern continuum modelling workflows. Recognizing that the battery and geoscience modelling communities operate predominantly within Python, `OpenImpala` was transitioned from a standalone compiled tool to a fully importable Python library (`import openimpala`) via `pybind11`. Distribution was modernised using `pyproject.toml` and `cibuildwheel` to enable standard package manager installations (e.g., `pip`). This transition to a community-accessible library is supported by a comprehensive Sphinx/ReadTheDocs infrastructure, featuring extensive tutorial series and Doxygen-generated API references. + +Finally, the physical fidelity and scientific rigor of the framework were significantly expanded. `OpenImpala` now supports complex multi-phase computations, allowing users to map specific phase identifiers to unique transport coefficients, alongside an integrated microstructural parameterisation engine capable of extracting macroscopic metrics such as Specific Surface Area (SSA), Representative Elementary Volumes (REV), and Pore Size Distributions (PSD). To guarantee the mathematical correctness of these new physics modules, rigorous software engineering practices were adopted, including automated CI/CD GitHub Actions pipelines, integrated Catch2 testing, synthetic analytical benchmarking, and comprehensive input validation to prevent silent numerical failures in complex HPC environments. + +* **Automated CI/CD:** Deployed rigorous GitHub Actions pipelines for automated code formatting (`clang-format`), static analysis (`clang-tidy`), and test coverage reporting via Codecov. + +# Future Directions + +Active development is focused on deepening integration with the broader scientific Python ecosystem. This includes the implementation of a direct memory-coupling API for PyBaMM, enabling researchers to perform 3D microstructural parameterisation and 1D electrochemical simulation in a single, zero-copy Python script. + +# Acknowledgements + +This work was financially supported by the EPSRC Centre for Doctoral Training (CDT) in Energy Storage and its Applications [grant ref: EP/R021295/1]; the Ada Lovelace Centre (ALC) STFC project, CANVAS-NXtomo; the EPSRC prosperity partnership with Imperial College, INFUSE [grant ref: EP/V038044/1]; the Rutherford Appleton Laboratory; The Faraday Institution through James Le Houx's Emerging Leader Fellowship [Grant No. FIELF001]; and Research England’s ‘Expanding Excellence in England’ grant at the University of Greenwich via the “Multi-scale Multi-disciplinary Modelling for Impact” (M34Impact) programme. + +The authors acknowledge the use of the IRIDIS High Performance Computing Facility, Diamond Light Source's Wilson HPC cluster, STFC's SCARF cluster, and the University of Greenwich's M34Impact HPC Cluster. We also thank the developers of AMReX, HYPRE, libtiff, and HDF5, upon which OpenImpala relies. + +# References From e889e251b8ca00cfdf25fc4ec6966c38f1be173a Mon Sep 17 00:00:00 2001 From: James Le Houx <37665786+jameslehoux@users.noreply.github.com> Date: Sun, 29 Mar 2026 15:39:20 +0100 Subject: [PATCH 02/24] Add bibliography entries for various research articles Added multiple references to the bibliography file including articles on OpenImpala, statistical effective diffusivity estimation, AMReX framework, and Python Battery Mathematical Modelling. --- paper.bib | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) create mode 100644 paper.bib diff --git a/paper.bib b/paper.bib new file mode 100644 index 00000000..999e8553 --- /dev/null +++ b/paper.bib @@ -0,0 +1,73 @@ +@article{LeHoux2021OpenImpala, + title = {{{OpenImpala}}: {{OPEN}} source {{IMage}} based {{PArallisable}} {{Linear}} {{Algebra}} solver}, + author = {Le Houx, James and Kramer, Denis}, + year = {2021}, + journal = {SoftwareX}, + volume = {15}, + pages = {100729}, + doi = {10.1016/j.softx.2021.100729}, + issn = {2352-7110} +} + +@article{le2023statistical, + title={Statistical Effective Diffusivity Estimation in Porous Media Using an Integrated On-site Imaging Workflow for Synchrotron Users}, + author={Le Houx, James and Ruiz, Siul and McKay Fletcher, Daniel and Ahmed, Sharif and Roose, Tiina}, + journal={Transport in Porous Media}, + volume={150}, + number={1}, + pages={71--88}, + year={2023}, + publisher={Springer}, + doi={10.1007/s11242-023-01993-7} +} + +@article{amrex2019, + title={AMReX: a framework for block-structured adaptive mesh refinement}, + author={Zhang, Weiqun and Almgren, Ann and Beckner, Vince and Bell, John and Blaschke, Johannes and Chan, Cy and Day, Marcus and Friesen, Brian and Gott, Kevin and Graves, Daniel and others}, + journal={Journal of Open Source Software}, + volume={4}, + number={37}, + pages={1370}, + year={2019}, + doi={10.21105/joss.01370} +} + +@article{lu2025immersed, + title={An immersed interface Adaptive Mesh Refinement algorithm for Li-ion battery simulations. II. Multi-dimensional extension and separator modeling}, + author={Lu, Jiawei and Nikiforakis, Nikolaos and Gokhale, Nandan}, + journal={Journal of Applied Physics}, + volume={138}, + number={4}, + pages={045002}, + year={2025}, + publisher={AIP Publishing}, + doi={10.1063/5.0281626} +} + +@article{sulzer2021python, + title={{Python Battery Mathematical Modelling (PyBaMM)}}, + author={Sulzer, Valentin and Marquis, Scott G and Timms, Robert and Robinson, Martin and Chapman, S Jon}, + journal={Journal of Open Research Software}, + volume={9}, + number={1}, + year={2021}, + publisher={Ubiquity Press}, + doi={10.5334/jors.309} +} + +@inproceedings{falgout2002hypre, + title={hypre: A library of high performance preconditioners}, + author={Falgout, Robert D and Yang, Ulrike Meier}, + booktitle={International Conference on Computational Science}, + pages={632--641}, + year={2002}, + organization={Springer}, + doi={10.1007/3-540-47789-6_66} +} + +@misc{jakob2017pybind11, + author = {Wenzel Jakob and Jason Rhinelander and Dean Moldovan}, + year = {2017}, + note = {https://github.com/pybind/pybind11}, + title = {pybind11 -- Seamless operability between C++11 and Python} +} From d756f42d13fae4f1527fb898073465ef7179ae79 Mon Sep 17 00:00:00 2001 From: James Le Houx <37665786+jameslehoux@users.noreply.github.com> Date: Sun, 29 Mar 2026 15:42:29 +0100 Subject: [PATCH 03/24] Revise software architecture and capabilities section Updated the software architecture section to reflect recent changes and improvements in the OpenImpala framework, including its transition to a Python library and enhancements in computational capabilities. --- paper.md | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/paper.md b/paper.md index 1aa428e8..5df46cea 100644 --- a/paper.md +++ b/paper.md @@ -39,17 +39,13 @@ High-resolution three-dimensional microstructural imaging is increasingly utiliz `OpenImpala` addresses this methodological gap by providing a natively parallelised (MPI, OpenMP, and CUDA) matrix-free AMReX backend capable of scaling across thousands of compute cores. Through its modern Python application programming interface (API), `OpenImpala` serves as a robust upstream microstructural parameterisation engine. It is capable of ingesting raw or segmented tomographic data and seamlessly exporting computed effective macroscopic properties to downstream continuum models, such as the widely adopted PyBaMM battery modelling framework. -# Major Advancements Since Initial Publication +# Software Architecture and Capabilities -`OpenImpala` was originally introduced as a command-line hybrid C++ and Fortran framework optimized for solving Laplace's equation to extract tortuosity. Since its initial publication in *SoftwareX*, the framework has undergone a complete architectural transformation. This overhaul was motivated by three primary objectives: achieving exascale readiness on heterogeneous hardware, enabling seamless interoperability with the broader scientific Python ecosystem, and ensuring long-term software sustainability. +`OpenImpala` is engineered to achieve exascale readiness on heterogeneous hardware, enable seamless interoperability with the broader scientific Python ecosystem, and ensure long-term software sustainability. To address the computational demands of increasingly massive tomographic datasets, the core architecture is built upon a pure C++ infrastructure utilizing native AMReX Lambdas. This design facilitates native GPU acceleration via CUDA, allowing the solver to fully leverage modern heterogeneous supercomputing architectures. Furthermore, to optimize memory utilization during massive out-of-core computations, the framework utilizes intelligent solver routing driven by AMReX’s Matrix-Free Multi-Level Multi-Grid (MLMG) infrastructure, all managed by a modern, modular CMake build system. -To address the computational demands of increasingly massive tomographic datasets, the core architecture was heavily modernised. The legacy Fortran compute kernels were entirely deprecated in favor of a pure C++ infrastructure utilizing native AMReX Lambdas. This shift facilitated the implementation of native GPU acceleration via CUDA, allowing the solver to fully leverage modern heterogeneous supercomputing architectures. Furthermore, to optimize memory utilization during massive out-of-core computations, the monolithic diffusion driver was deconstructed. The framework now utilizes intelligent solver routing driven by AMReX’s Matrix-Free Multi-Level Multi-Grid (MLMG) infrastructure, all managed by a modernised, modular CMake build system. +Beyond raw computational performance, the framework functions as an accessible, upstream parameterisation engine for modern continuum modelling workflows. Recognizing that the battery and geoscience modelling communities operate predominantly within Python, `OpenImpala` acts as a fully importable Python library (`import openimpala`) via `pybind11`. Distribution is handled using `pyproject.toml` and `cibuildwheel` to enable standard package manager installations (e.g., `pip`). This community-accessible library is supported by a comprehensive Sphinx and ReadTheDocs infrastructure, featuring extensive tutorial series and Doxygen-generated API references. -Beyond raw computational performance, the framework was redesigned to function as an accessible, upstream parameterisation engine for modern continuum modelling workflows. Recognizing that the battery and geoscience modelling communities operate predominantly within Python, `OpenImpala` was transitioned from a standalone compiled tool to a fully importable Python library (`import openimpala`) via `pybind11`. Distribution was modernised using `pyproject.toml` and `cibuildwheel` to enable standard package manager installations (e.g., `pip`). This transition to a community-accessible library is supported by a comprehensive Sphinx/ReadTheDocs infrastructure, featuring extensive tutorial series and Doxygen-generated API references. - -Finally, the physical fidelity and scientific rigor of the framework were significantly expanded. `OpenImpala` now supports complex multi-phase computations, allowing users to map specific phase identifiers to unique transport coefficients, alongside an integrated microstructural parameterisation engine capable of extracting macroscopic metrics such as Specific Surface Area (SSA), Representative Elementary Volumes (REV), and Pore Size Distributions (PSD). To guarantee the mathematical correctness of these new physics modules, rigorous software engineering practices were adopted, including automated CI/CD GitHub Actions pipelines, integrated Catch2 testing, synthetic analytical benchmarking, and comprehensive input validation to prevent silent numerical failures in complex HPC environments. - -* **Automated CI/CD:** Deployed rigorous GitHub Actions pipelines for automated code formatting (`clang-format`), static analysis (`clang-tidy`), and test coverage reporting via Codecov. +Finally, the physical fidelity and scientific rigor of the framework are a primary focus. `OpenImpala` supports complex multi-phase computations, allowing users to map specific phase identifiers to unique transport coefficients. It also features an integrated microstructural parameterisation engine capable of extracting macroscopic metrics such as Specific Surface Area (SSA), Representative Elementary Volumes (REV), and Pore Size Distributions (PSD). To guarantee the mathematical correctness of these physics modules, rigorous software engineering practices are enforced. This includes comprehensive input validation to prevent silent numerical failures in complex HPC environments, synthetic analytical benchmarking, integrated Catch2 testing, and automated CI/CD GitHub Actions pipelines for code formatting (`clang-format`), static analysis (`clang-tidy`), and test coverage reporting via Codecov. # Future Directions From 395b8ec1947ede49d6d528313a4a762882ec5c7e Mon Sep 17 00:00:00 2001 From: James Le Houx Date: Sun, 29 Mar 2026 20:08:16 +0000 Subject: [PATCH 04/24] Add performance profiling and CI benchmark workflow (#31) - Enhance profiling notebook with AMReX TinyProfiler breakdown (solver setup vs linear solve vs flux computation) and NVIDIA Nsight Systems GPU kernel profiling for Colab T4 runtimes - Add CI benchmark workflow that runs on PRs touching solver code, tests against analytical solutions (uniform block tau=(N-1)/N), and posts timing results as PR comments for regression tracking https://claude.ai/code/session_01RKnn97qiD7sbCeABHH3eQk --- .github/workflows/benchmark.yml | 306 +++++++++++++++++++ notebooks/profiling_and_tuning.ipynb | 429 +++++++++++++++++++++++---- 2 files changed, 680 insertions(+), 55 deletions(-) create mode 100644 .github/workflows/benchmark.yml diff --git a/.github/workflows/benchmark.yml b/.github/workflows/benchmark.yml new file mode 100644 index 00000000..2d348bfb --- /dev/null +++ b/.github/workflows/benchmark.yml @@ -0,0 +1,306 @@ +# .github/workflows/benchmark.yml +# Performance regression tracking for OpenImpala. +# +# Runs a fixed set of synthetic benchmarks and posts timing results +# as a PR comment or job summary. Triggered manually or on PRs that +# touch solver code. + +name: Performance Benchmark + +on: + pull_request: + paths: + - 'src/props/**' + - 'python/bindings/**' + - 'python/openimpala/**' + - 'CMakeLists.txt' + workflow_dispatch: + inputs: + grid_sizes: + description: 'Comma-separated grid sizes to benchmark (e.g. 64,128)' + required: false + default: '64,128' + +concurrency: + group: benchmark-${{ github.ref }} + cancel-in-progress: true + +jobs: + benchmark: + name: Run Benchmarks + runs-on: ubuntu-latest + permissions: + pull-requests: write + + steps: + - name: Checkout repository + uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Set up Apptainer + uses: eWaterCycle/setup-apptainer@v2 + with: + apptainer-version: 1.2.5 + + - name: Restore Dependency SIF Cache + id: cache-restore-sif + uses: actions/cache/restore@v4 + with: + path: dependency_image.sif + key: ${{ runner.os }}-apptainer-sif-${{ hashFiles('containers/Singularity.deps.def') }} + restore-keys: | + ${{ runner.os }}-apptainer-sif- + + - name: Build Dependency SIF (if cache miss) + if: steps.cache-restore-sif.outputs.cache-hit != 'true' + run: | + sudo apptainer build --force dependency_image.sif containers/Singularity.deps.def + + - name: Build OpenImpala with Python bindings + run: | + sudo apptainer exec \ + --writable-tmpfs \ + --bind $PWD:/src \ + ./dependency_image.sif \ + bash -c ' + source /opt/rh/gcc-toolset-11/enable + cd /src + rm -rf cmake_build_bench && mkdir cmake_build_bench && cd cmake_build_bench + + PYBIND11_DIR=$(python3.11 -m pybind11 --cmakedir) + + cmake .. \ + -DCMAKE_BUILD_TYPE=Release \ + -DCMAKE_C_COMPILER=$(which mpicc) \ + -DCMAKE_CXX_COMPILER=$(which mpicxx) \ + -DPython_EXECUTABLE=/usr/bin/python3.11 \ + -Dpybind11_DIR=${PYBIND11_DIR} \ + -DCMAKE_PREFIX_PATH="/opt/amrex/25.03;/opt/hypre/v2.32.0;/opt/hdf5/1.12.3;/opt/libtiff/4.6.0" \ + -DBUILD_TESTING=OFF \ + -DOPENIMPALA_PYTHON=ON + + make -j$(nproc) + ' + + - name: Run benchmark suite + id: benchmark + run: | + # Determine grid sizes + if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then + SIZES="${{ github.event.inputs.grid_sizes }}" + else + SIZES="64,128" + fi + + cat > /tmp/bench.py << 'BENCHEOF' + import sys + import os + import json + import time + import numpy as np + + # Add the build directory to Python path + sys.path.insert(0, "/src/cmake_build_bench") + + import openimpala as oi + + sizes = [int(s) for s in os.environ.get("BENCH_SIZES", "64,128").split(",")] + solvers = ["pcg", "flexgmres", "pfmg", "mlmg"] + n_repeats = 3 + results = [] + + with oi.Session(): + for size in sizes: + print(f"\n=== Benchmarking {size}³ ===", flush=True) + + # Generate uniform block (analytical solution: tau = (N-1)/N) + data = np.zeros((size, size, size), dtype=np.int32) + + for solver_name in solvers: + times = [] + res = None + failed = False + + for trial in range(n_repeats): + try: + t0 = time.perf_counter() + res = oi.tortuosity( + data, phase=0, direction="z", + solver=solver_name, max_grid_size=32, verbose=0 + ) + t1 = time.perf_counter() + times.append(t1 - t0) + except Exception as e: + print(f" {solver_name}: FAILED — {e}") + failed = True + break + + if not failed and res is not None: + median_t = float(np.median(times)) + expected_tau = (size - 1) / size + tau_error = abs(res.tortuosity - expected_tau) / expected_tau + + record = { + "size": size, + "solver": solver_name, + "wall_time_s": round(median_t, 4), + "tortuosity": round(res.tortuosity, 6), + "expected_tau": round(expected_tau, 6), + "relative_error": round(tau_error, 8), + "iterations": res.iterations, + "converged": res.solver_converged, + } + results.append(record) + print(f" {solver_name:12s} {median_t:.4f}s " + f"τ={res.tortuosity:.6f} " + f"err={tau_error:.2e} " + f"iters={res.iterations}") + else: + results.append({ + "size": size, "solver": solver_name, + "wall_time_s": None, "tortuosity": None, + "expected_tau": (size - 1) / size, + "relative_error": None, "iterations": None, + "converged": False, + }) + + # Write results to JSON + with open("/tmp/benchmark_results.json", "w") as f: + json.dump(results, f, indent=2) + + print(f"\n{len(results)} benchmark results written.") + BENCHEOF + + sudo apptainer exec \ + --writable-tmpfs \ + --bind $PWD:/src \ + ./dependency_image.sif \ + bash -c " + source /opt/rh/gcc-toolset-11/enable + export PYTHONPATH=/src/cmake_build_bench + export BENCH_SIZES=$SIZES + cd /src && python3.11 /tmp/bench.py + " + + # Copy results out for the next step + cp /tmp/benchmark_results.json ./benchmark_results.json 2>/dev/null || true + + - name: Generate benchmark report + id: report + run: | + python3 << 'REPORTEOF' + import json + import os + + with open("benchmark_results.json") as f: + results = json.load(f) + + # Build markdown table + lines = [] + lines.append("## Performance Benchmark Results") + lines.append("") + lines.append("| Size | Solver | Wall Time (s) | Tortuosity | Expected | Rel. Error | Iters | Status |") + lines.append("|------|--------|--------------|------------|----------|-----------|-------|--------|") + + any_failed = False + any_regression = False + + for r in results: + if r["converged"]: + err_str = f'{r["relative_error"]:.2e}' + status = "PASS" if r["relative_error"] < 1e-4 else "WARN" + if r["relative_error"] >= 1e-4: + any_regression = True + status = "WARN" + else: + err_str = "N/A" + status = "FAIL" + any_failed = True + + time_str = f'{r["wall_time_s"]:.4f}' if r["wall_time_s"] else "N/A" + tau_str = f'{r["tortuosity"]:.6f}' if r["tortuosity"] else "N/A" + exp_str = f'{r["expected_tau"]:.6f}' + iter_str = str(r["iterations"]) if r["iterations"] else "N/A" + + lines.append(f'| {r["size"]}³ | {r["solver"]} | {time_str} | {tau_str} | {exp_str} | {err_str} | {iter_str} | {status} |') + + lines.append("") + + # Summary + converged = [r for r in results if r["converged"]] + if converged: + fastest = min(converged, key=lambda r: r["wall_time_s"]) + lines.append(f'**Fastest solver:** {fastest["solver"]} at {fastest["size"]}³ ' + f'({fastest["wall_time_s"]:.4f}s)') + if any_failed: + lines.append("**Warning:** Some solvers failed to converge.") + if any_regression: + lines.append("**Warning:** Some results exceed expected error tolerance (>1e-4).") + + lines.append("") + lines.append("*Benchmark: uniform block (analytical τ = (N-1)/N)*") + + report = "\n".join(lines) + + # Write to step summary + with open(os.environ.get("GITHUB_STEP_SUMMARY", "/dev/null"), "a") as f: + f.write(report) + + # Write to file for PR comment + with open("benchmark_report.md", "w") as f: + f.write(report) + + print(report) + REPORTEOF + + - name: Post benchmark results to PR + if: github.event_name == 'pull_request' + uses: actions/github-script@v7 + with: + script: | + const fs = require('fs'); + let report = ''; + try { + report = fs.readFileSync('benchmark_report.md', 'utf8'); + } catch (e) { + report = 'Benchmark report not available.'; + } + + const marker = ''; + const body = marker + '\n' + report; + + const { data: comments } = await github.rest.issues.listComments({ + owner: context.repo.owner, + repo: context.repo.repo, + issue_number: context.issue.number, + }); + + const existing = comments.find(c => c.body.includes(marker)); + + if (existing) { + await github.rest.issues.updateComment({ + owner: context.repo.owner, + repo: context.repo.repo, + comment_id: existing.id, + body: body, + }); + } else { + await github.rest.issues.createComment({ + owner: context.repo.owner, + repo: context.repo.repo, + issue_number: context.issue.number, + body: body, + }); + } + + - name: Upload benchmark artifacts + if: always() + uses: actions/upload-artifact@v4 + with: + name: benchmark-results-${{ github.run_id }} + path: | + benchmark_results.json + benchmark_report.md + retention-days: 90 + if-no-files-found: warn diff --git a/notebooks/profiling_and_tuning.ipynb b/notebooks/profiling_and_tuning.ipynb index 83866cce..b468dbd6 100644 --- a/notebooks/profiling_and_tuning.ipynb +++ b/notebooks/profiling_and_tuning.ipynb @@ -19,23 +19,7 @@ { "cell_type": "markdown", "metadata": {}, - "source": [ - "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/main/notebooks/profiling_and_tuning.ipynb)\n", - "\n", - "# OpenImpala Interactive Profiling & Tuning\n", - "\n", - "This notebook profiles solver performance, tunes AMReX grid decomposition, and compares scaling behavior — all within Google Colab's free tier (T4 GPU, ~12 GB RAM).\n", - "\n", - "**Sections:**\n", - "1. Synthetic Dataset Generation\n", - "2. Baseline Validation\n", - "3. Solver Profiling Sweep\n", - "4. Grid Size Sweep (`max_grid_size`)\n", - "5. Dataset Size Scaling\n", - "6. Direction Anisotropy Check\n", - "7. Multi-Porosity Comparison\n", - "8. Summary & HPC Recommendations" - ] + "source": "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BASE-Laboratory/OpenImpala/blob/main/notebooks/profiling_and_tuning.ipynb)\n\n# OpenImpala Interactive Profiling & Tuning\n\nThis notebook profiles solver performance, tunes AMReX grid decomposition, and compares scaling behavior \u2014 all within Google Colab's free tier (T4 GPU, ~12 GB RAM).\n\n**Sections:**\n1. Synthetic Dataset Generation\n2. Baseline Validation\n3. Solver Profiling Sweep\n4. Grid Size Sweep (`max_grid_size`)\n5. Dataset Size Scaling\n6. Direction Anisotropy Check\n7. Multi-Porosity Comparison\n8. **AMReX TinyProfiler Breakdown** *(solver setup vs solve vs flux)*\n9. **NVIDIA Nsight Systems GPU Profiling** *(kernel-level timeline)*\n10. Summary & HPC Recommendations" }, { "cell_type": "markdown", @@ -61,10 +45,10 @@ " if gpu_name:\n", " print(f\"GPU detected: {gpu_name}\")\n", " else:\n", - " print(\"No GPU detected — running CPU-only.\")\n", + " print(\"No GPU detected \u2014 running CPU-only.\")\n", "except FileNotFoundError:\n", " gpu_name = \"\"\n", - " print(\"nvidia-smi not found — running CPU-only.\")" + " print(\"nvidia-smi not found \u2014 running CPU-only.\")" ] }, { @@ -147,9 +131,9 @@ "data_medium = generate_microstructure(128)\n", "data_large = generate_microstructure(256)\n", "\n", - "for name, data in [(\"small (64³)\", data_small),\n", - " (\"medium (128³)\", data_medium),\n", - " (\"large (256³)\", data_large)]:\n", + "for name, data in [(\"small (64\u00b3)\", data_small),\n", + " (\"medium (128\u00b3)\", data_medium),\n", + " (\"large (256\u00b3)\", data_large)]:\n", " vf = np.sum(data == 0) / data.size\n", " print(f\"{name:16s} shape={str(data.shape):20s} porosity={vf:.3f}\")" ] @@ -164,7 +148,7 @@ "fig, ax = plt.subplots(figsize=(6, 6))\n", "mid_z = data_medium.shape[0] // 2\n", "ax.imshow(data_medium[mid_z, :, :], cmap=\"gray\", origin=\"lower\")\n", - "ax.set_title(f\"data_medium — Z-slice {mid_z} (white=solid, black=pore)\")\n", + "ax.set_title(f\"data_medium \u2014 Z-slice {mid_z} (white=solid, black=pore)\")\n", "ax.set_xlabel(\"X\")\n", "ax.set_ylabel(\"Y\")\n", "plt.tight_layout()\n", @@ -175,7 +159,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Section 2: Baseline — Quick Validation\n", + "## Section 2: Baseline \u2014 Quick Validation\n", "\n", "Run a quick sanity check on `data_small` to confirm the solver pipeline works." ] @@ -204,7 +188,7 @@ "source": [ "## Section 3: Solver Profiling Sweep\n", "\n", - "Profile all available HYPRE solvers on `data_medium` (128³). Each solver is run 3 times; we report the median wall time." + "Profile all available HYPRE solvers on `data_medium` (128\u00b3). Each solver is run 3 times; we report the median wall time." ] }, { @@ -231,7 +215,7 @@ " times.append(t1 - t0)\n", " last_result = res\n", " except Exception as e:\n", - " print(f\" {solver_name} trial {trial}: FAILED — {e}\")\n", + " print(f\" {solver_name} trial {trial}: FAILED \u2014 {e}\")\n", " failed = True\n", " break\n", "\n", @@ -251,7 +235,7 @@ " \"converged\": last_result.solver_converged\n", " })\n", " status = \"OK\" if not failed else \"FAILED\"\n", - " print(f\" {solver_name:12s} — {status}\")\n", + " print(f\" {solver_name:12s} \u2014 {status}\")\n", "\n", "df_solvers = pd.DataFrame(solver_records)\n", "df_solvers" @@ -262,7 +246,7 @@ "execution_count": null, "metadata": {}, "outputs": [], - "source": "# Dual-axis bar chart: wall time (bars) + iteration count (dots)\nfig, ax1 = plt.subplots(figsize=(10, 5))\n\ncolors = [\"#2ecc71\" if c else \"#e74c3c\" for c in df_solvers[\"converged\"]]\ny_pos = np.arange(len(df_solvers))\nbars = ax1.barh(y_pos, df_solvers[\"wall_time_s\"], color=colors, alpha=0.85,\n edgecolor=\"white\", linewidth=0.8)\nax1.set_yticks(y_pos)\nax1.set_yticklabels(df_solvers[\"solver\"])\nax1.set_xlabel(\"Wall Time (s)\", color=\"#2c3e50\")\nax1.set_title(\"Solver Profiling — 128³ dataset\", fontsize=13, fontweight=\"bold\")\nax1.tick_params(axis=\"x\", colors=\"#2c3e50\")\n\n# Second x-axis for iteration count\nax2 = ax1.twiny()\nax2.plot(df_solvers[\"iterations\"], y_pos, \"D\", color=\"#8e44ad\", markersize=8,\n zorder=5, label=\"Iterations\")\nax2.set_xlabel(\"Iterations\", color=\"#8e44ad\")\nax2.tick_params(axis=\"x\", colors=\"#8e44ad\")\n\n# Convergence legend\nfrom matplotlib.patches import Patch\nlegend_elements = [Patch(facecolor=\"#2ecc71\", label=\"Converged\"),\n Patch(facecolor=\"#e74c3c\", label=\"Failed\"),\n plt.Line2D([0], [0], marker=\"D\", color=\"#8e44ad\", linestyle=\"None\",\n markersize=8, label=\"Iterations\")]\nax1.legend(handles=legend_elements, loc=\"lower right\", framealpha=0.9)\n\nplt.tight_layout()\nplt.show()" + "source": "# Dual-axis bar chart: wall time (bars) + iteration count (dots)\nfig, ax1 = plt.subplots(figsize=(10, 5))\n\ncolors = [\"#2ecc71\" if c else \"#e74c3c\" for c in df_solvers[\"converged\"]]\ny_pos = np.arange(len(df_solvers))\nbars = ax1.barh(y_pos, df_solvers[\"wall_time_s\"], color=colors, alpha=0.85,\n edgecolor=\"white\", linewidth=0.8)\nax1.set_yticks(y_pos)\nax1.set_yticklabels(df_solvers[\"solver\"])\nax1.set_xlabel(\"Wall Time (s)\", color=\"#2c3e50\")\nax1.set_title(\"Solver Profiling \u2014 128\u00b3 dataset\", fontsize=13, fontweight=\"bold\")\nax1.tick_params(axis=\"x\", colors=\"#2c3e50\")\n\n# Second x-axis for iteration count\nax2 = ax1.twiny()\nax2.plot(df_solvers[\"iterations\"], y_pos, \"D\", color=\"#8e44ad\", markersize=8,\n zorder=5, label=\"Iterations\")\nax2.set_xlabel(\"Iterations\", color=\"#8e44ad\")\nax2.tick_params(axis=\"x\", colors=\"#8e44ad\")\n\n# Convergence legend\nfrom matplotlib.patches import Patch\nlegend_elements = [Patch(facecolor=\"#2ecc71\", label=\"Converged\"),\n Patch(facecolor=\"#e74c3c\", label=\"Failed\"),\n plt.Line2D([0], [0], marker=\"D\", color=\"#8e44ad\", linestyle=\"None\",\n markersize=8, label=\"Iterations\")]\nax1.legend(handles=legend_elements, loc=\"lower right\", framealpha=0.9)\n\nplt.tight_layout()\nplt.show()" }, { "cell_type": "code", @@ -307,7 +291,7 @@ " \"wall_time_s\": np.median(times),\n", " \"tortuosity\": res.tortuosity\n", " })\n", - " print(f\" max_grid_size={gs:4d} time={np.median(times):.3f}s τ={res.tortuosity:.4f}\")\n", + " print(f\" max_grid_size={gs:4d} time={np.median(times):.3f}s \u03c4={res.tortuosity:.4f}\")\n", "\n", "df_grid = pd.DataFrame(grid_records)\n", "df_grid" @@ -323,14 +307,14 @@ "ax.plot(df_grid[\"max_grid_size\"], df_grid[\"wall_time_s\"], \"o-\", linewidth=2, markersize=8)\n", "ax.set_xlabel(\"max_grid_size\")\n", "ax.set_ylabel(\"Wall Time (s)\")\n", - "ax.set_title(f\"Grid Decomposition Sweep — 128³, solver={best_solver}\")\n", + "ax.set_title(f\"Grid Decomposition Sweep \u2014 128\u00b3, solver={best_solver}\")\n", "\n", "# Annotations\n", - "ax.annotate(\"Smaller boxes →\\nbetter cache locality,\\nmore MPI parallelism\",\n", + "ax.annotate(\"Smaller boxes \u2192\\nbetter cache locality,\\nmore MPI parallelism\",\n", " xy=(grid_sizes[0], df_grid[\"wall_time_s\"].iloc[0]),\n", " xytext=(grid_sizes[1], df_grid[\"wall_time_s\"].max() * 0.9),\n", " arrowprops=dict(arrowstyle=\"->\", color=\"gray\"), fontsize=9, color=\"gray\")\n", - "ax.annotate(\"Larger boxes →\\nbetter GPU saturation,\\nless overhead\",\n", + "ax.annotate(\"Larger boxes \u2192\\nbetter GPU saturation,\\nless overhead\",\n", " xy=(grid_sizes[-1], df_grid[\"wall_time_s\"].iloc[-1]),\n", " xytext=(grid_sizes[-2], df_grid[\"wall_time_s\"].max() * 0.9),\n", " arrowprops=dict(arrowstyle=\"->\", color=\"gray\"), fontsize=9, color=\"gray\")\n", @@ -356,7 +340,7 @@ }, { "cell_type": "code", - "source": "# Solver x Grid Size heatmap — full 2D parameter sweep\nconverged_solvers = df_solvers[df_solvers[\"converged\"]][\"solver\"].tolist()\nheatmap_grid_sizes = [8, 16, 32, 64, 128]\nheatmap_data = np.full((len(converged_solvers), len(heatmap_grid_sizes)), np.nan)\n\nfor i, s in enumerate(converged_solvers):\n for j, gs in enumerate(heatmap_grid_sizes):\n try:\n t0 = time.perf_counter()\n oi.tortuosity(data_medium, phase=0, direction=\"z\",\n solver=s, max_grid_size=gs, verbose=0)\n heatmap_data[i, j] = time.perf_counter() - t0\n except Exception:\n heatmap_data[i, j] = np.nan\n print(f\" {s} done\")\n\nfig, ax = plt.subplots(figsize=(9, max(4, len(converged_solvers) * 0.7)))\nim = ax.imshow(heatmap_data, cmap=\"YlOrRd\", aspect=\"auto\")\n\n# Annotate each cell with its value\nfor i in range(heatmap_data.shape[0]):\n for j in range(heatmap_data.shape[1]):\n val = heatmap_data[i, j]\n if not np.isnan(val):\n text_color = \"white\" if val > np.nanmedian(heatmap_data) else \"black\"\n ax.text(j, i, f\"{val:.2f}s\", ha=\"center\", va=\"center\",\n fontsize=9, fontweight=\"bold\", color=text_color)\n\nax.set_xticks(range(len(heatmap_grid_sizes)))\nax.set_xticklabels(heatmap_grid_sizes)\nax.set_yticks(range(len(converged_solvers)))\nax.set_yticklabels(converged_solvers)\nax.set_xlabel(\"max_grid_size\")\nax.set_ylabel(\"Solver\")\nax.set_title(\"Solver x Grid Size — Wall Time (s)\", fontsize=13, fontweight=\"bold\")\n\n# Mark the global minimum\nbest_idx = np.unravel_index(np.nanargmin(heatmap_data), heatmap_data.shape)\nax.add_patch(plt.Rectangle((best_idx[1] - 0.5, best_idx[0] - 0.5), 1, 1,\n fill=False, edgecolor=\"#2ecc71\", linewidth=3, linestyle=\"--\"))\nax.text(best_idx[1], best_idx[0] + 0.35, \"BEST\", ha=\"center\", va=\"top\",\n fontsize=7, color=\"#2ecc71\", fontweight=\"bold\")\n\ncbar = plt.colorbar(im, ax=ax, shrink=0.8)\ncbar.set_label(\"Wall Time (s)\")\nplt.tight_layout()\nplt.show()", + "source": "# Solver x Grid Size heatmap \u2014 full 2D parameter sweep\nconverged_solvers = df_solvers[df_solvers[\"converged\"]][\"solver\"].tolist()\nheatmap_grid_sizes = [8, 16, 32, 64, 128]\nheatmap_data = np.full((len(converged_solvers), len(heatmap_grid_sizes)), np.nan)\n\nfor i, s in enumerate(converged_solvers):\n for j, gs in enumerate(heatmap_grid_sizes):\n try:\n t0 = time.perf_counter()\n oi.tortuosity(data_medium, phase=0, direction=\"z\",\n solver=s, max_grid_size=gs, verbose=0)\n heatmap_data[i, j] = time.perf_counter() - t0\n except Exception:\n heatmap_data[i, j] = np.nan\n print(f\" {s} done\")\n\nfig, ax = plt.subplots(figsize=(9, max(4, len(converged_solvers) * 0.7)))\nim = ax.imshow(heatmap_data, cmap=\"YlOrRd\", aspect=\"auto\")\n\n# Annotate each cell with its value\nfor i in range(heatmap_data.shape[0]):\n for j in range(heatmap_data.shape[1]):\n val = heatmap_data[i, j]\n if not np.isnan(val):\n text_color = \"white\" if val > np.nanmedian(heatmap_data) else \"black\"\n ax.text(j, i, f\"{val:.2f}s\", ha=\"center\", va=\"center\",\n fontsize=9, fontweight=\"bold\", color=text_color)\n\nax.set_xticks(range(len(heatmap_grid_sizes)))\nax.set_xticklabels(heatmap_grid_sizes)\nax.set_yticks(range(len(converged_solvers)))\nax.set_yticklabels(converged_solvers)\nax.set_xlabel(\"max_grid_size\")\nax.set_ylabel(\"Solver\")\nax.set_title(\"Solver x Grid Size \u2014 Wall Time (s)\", fontsize=13, fontweight=\"bold\")\n\n# Mark the global minimum\nbest_idx = np.unravel_index(np.nanargmin(heatmap_data), heatmap_data.shape)\nax.add_patch(plt.Rectangle((best_idx[1] - 0.5, best_idx[0] - 0.5), 1, 1,\n fill=False, edgecolor=\"#2ecc71\", linewidth=3, linestyle=\"--\"))\nax.text(best_idx[1], best_idx[0] + 0.35, \"BEST\", ha=\"center\", va=\"top\",\n fontsize=7, color=\"#2ecc71\", fontweight=\"bold\")\n\ncbar = plt.colorbar(im, ax=ax, shrink=0.8)\ncbar.set_label(\"Wall Time (s)\")\nplt.tight_layout()\nplt.show()", "metadata": {}, "execution_count": null, "outputs": [] @@ -376,7 +360,7 @@ "metadata": {}, "outputs": [], "source": [ - "datasets = [(\"64³\", data_small, 64), (\"128³\", data_medium, 128), (\"256³\", data_large, 256)]\n", + "datasets = [(\"64\u00b3\", data_small, 64), (\"128\u00b3\", data_medium, 128), (\"256\u00b3\", data_large, 256)]\n", "size_records = []\n", "\n", "for label, data, size in datasets:\n", @@ -392,7 +376,7 @@ " \"label\": label, \"size\": size, \"n_voxels\": size**3,\n", " \"wall_time_s\": median_t, \"tortuosity\": res.tortuosity\n", " })\n", - " print(f\" {label} time={median_t:.3f}s τ={res.tortuosity:.4f}\")\n", + " print(f\" {label} time={median_t:.3f}s \u03c4={res.tortuosity:.4f}\")\n", "\n", "df_size = pd.DataFrame(size_records)\n", "df_size" @@ -403,7 +387,7 @@ "execution_count": null, "metadata": {}, "outputs": [], - "source": "# Log-log plot with filled area under the scaling curve\nlog_n = np.log10(df_size[\"n_voxels\"].values.astype(float))\nlog_t = np.log10(df_size[\"wall_time_s\"].values)\ncoeffs = np.polyfit(log_n, log_t, 1)\nexponent = coeffs[0]\n\n# Dense interpolation for smooth fill\nn_dense = np.logspace(log_n.min(), log_n.max(), 200)\nt_dense = 10 ** np.polyval(coeffs, np.log10(n_dense))\n\nfig, ax = plt.subplots(figsize=(9, 5))\n\n# Filled area under measured curve (gradient effect via stacked fills)\nfrom matplotlib.colors import LinearSegmentedColormap\nn_vals = df_size[\"n_voxels\"].values.astype(float)\nt_vals = df_size[\"wall_time_s\"].values\nax.fill_between(n_dense, t_dense, alpha=0.15, color=\"#3498db\")\nax.fill_between(n_vals, t_vals, alpha=0.25, color=\"#3498db\", step=\"mid\")\n\n# Power-law fit line\nax.loglog(n_dense, t_dense, \"--\", color=\"#95a5a6\", linewidth=1.5,\n label=f\"Power-law fit: O(N$^{{{exponent:.2f}}}$)\")\n\n# Measured data points (prominent)\nax.loglog(n_vals, t_vals, \"o-\", color=\"#2c3e50\", linewidth=2.5,\n markersize=12, markerfacecolor=\"#3498db\", markeredgecolor=\"#2c3e50\",\n markeredgewidth=2, label=\"Measured\", zorder=5)\n\nax.set_xlabel(\"Number of Voxels\", fontsize=11)\nax.set_ylabel(\"Wall Time (s)\", fontsize=11)\nax.set_title(f\"Size Scaling — solver={best_solver}, max_grid_size={best_grid_size}\",\n fontsize=13, fontweight=\"bold\")\nax.legend(fontsize=10, framealpha=0.9)\n\nfor _, row in df_size.iterrows():\n ax.annotate(row[\"label\"], (row[\"n_voxels\"], row[\"wall_time_s\"]),\n textcoords=\"offset points\", xytext=(12, 8), fontsize=10,\n fontweight=\"bold\", color=\"#2c3e50\",\n arrowprops=dict(arrowstyle=\"->\", color=\"#bdc3c7\", lw=1.2))\n\nplt.tight_layout()\nplt.show()\nprint(f\"Scaling exponent: {exponent:.2f}\")" + "source": "# Log-log plot with filled area under the scaling curve\nlog_n = np.log10(df_size[\"n_voxels\"].values.astype(float))\nlog_t = np.log10(df_size[\"wall_time_s\"].values)\ncoeffs = np.polyfit(log_n, log_t, 1)\nexponent = coeffs[0]\n\n# Dense interpolation for smooth fill\nn_dense = np.logspace(log_n.min(), log_n.max(), 200)\nt_dense = 10 ** np.polyval(coeffs, np.log10(n_dense))\n\nfig, ax = plt.subplots(figsize=(9, 5))\n\n# Filled area under measured curve (gradient effect via stacked fills)\nfrom matplotlib.colors import LinearSegmentedColormap\nn_vals = df_size[\"n_voxels\"].values.astype(float)\nt_vals = df_size[\"wall_time_s\"].values\nax.fill_between(n_dense, t_dense, alpha=0.15, color=\"#3498db\")\nax.fill_between(n_vals, t_vals, alpha=0.25, color=\"#3498db\", step=\"mid\")\n\n# Power-law fit line\nax.loglog(n_dense, t_dense, \"--\", color=\"#95a5a6\", linewidth=1.5,\n label=f\"Power-law fit: O(N$^{{{exponent:.2f}}}$)\")\n\n# Measured data points (prominent)\nax.loglog(n_vals, t_vals, \"o-\", color=\"#2c3e50\", linewidth=2.5,\n markersize=12, markerfacecolor=\"#3498db\", markeredgecolor=\"#2c3e50\",\n markeredgewidth=2, label=\"Measured\", zorder=5)\n\nax.set_xlabel(\"Number of Voxels\", fontsize=11)\nax.set_ylabel(\"Wall Time (s)\", fontsize=11)\nax.set_title(f\"Size Scaling \u2014 solver={best_solver}, max_grid_size={best_grid_size}\",\n fontsize=13, fontweight=\"bold\")\nax.legend(fontsize=10, framealpha=0.9)\n\nfor _, row in df_size.iterrows():\n ax.annotate(row[\"label\"], (row[\"n_voxels\"], row[\"wall_time_s\"]),\n textcoords=\"offset points\", xytext=(12, 8), fontsize=10,\n fontweight=\"bold\", color=\"#2c3e50\",\n arrowprops=dict(arrowstyle=\"->\", color=\"#bdc3c7\", lw=1.2))\n\nplt.tight_layout()\nplt.show()\nprint(f\"Scaling exponent: {exponent:.2f}\")" }, { "cell_type": "markdown", @@ -427,7 +411,7 @@ " res = oi.tortuosity(data_medium, phase=0, direction=d,\n", " solver=best_solver, max_grid_size=best_grid_size, verbose=0)\n", " tau_values[d] = res.tortuosity\n", - " print(f\" τ_{d} = {res.tortuosity:.4f}\")" + " print(f\" \u03c4_{d} = {res.tortuosity:.4f}\")" ] }, { @@ -435,7 +419,7 @@ "execution_count": null, "metadata": {}, "outputs": [], - "source": "# Radar (spider) chart for directional anisotropy\nlabels = [f\"$\\\\tau_{d}$\" for d in directions]\nvalues = [tau_values[d] for d in directions]\n\n# Close the polygon\nangles = np.linspace(0, 2 * np.pi, len(directions), endpoint=False).tolist()\nvalues_closed = values + [values[0]]\nangles_closed = angles + [angles[0]]\n\nfig, ax = plt.subplots(figsize=(6, 6), subplot_kw=dict(polar=True))\n\n# Filled radar area\nax.fill(angles_closed, values_closed, color=\"#3498db\", alpha=0.2)\nax.plot(angles_closed, values_closed, \"o-\", color=\"#3498db\", linewidth=2.5,\n markersize=10, markerfacecolor=\"white\", markeredgecolor=\"#3498db\",\n markeredgewidth=2.5)\n\n# Value labels at each vertex\nfor angle, val, d in zip(angles, values, directions):\n ax.annotate(f\"{val:.3f}\", xy=(angle, val),\n textcoords=\"offset points\", xytext=(8, 8),\n fontsize=11, fontweight=\"bold\", color=\"#2c3e50\")\n\n# Isotropic reference circle (mean tortuosity)\ntau_mean = np.mean(values)\nref_angles = np.linspace(0, 2 * np.pi, 100)\nax.plot(ref_angles, [tau_mean] * len(ref_angles), \"--\", color=\"#e74c3c\",\n linewidth=1.5, alpha=0.6, label=f\"Isotropic ref (τ={tau_mean:.3f})\")\n\nax.set_xticks(angles)\nax.set_xticklabels([f\"$\\\\tau_{{{d.upper()}}}$\" for d in directions], fontsize=13)\nax.set_title(\"Direction Anisotropy — 128³ isotropic blobs\\n\",\n fontsize=13, fontweight=\"bold\")\nax.legend(loc=\"upper right\", bbox_to_anchor=(1.3, 1.1), framealpha=0.9)\n\nplt.tight_layout()\nplt.show()" + "source": "# Radar (spider) chart for directional anisotropy\nlabels = [f\"$\\\\tau_{d}$\" for d in directions]\nvalues = [tau_values[d] for d in directions]\n\n# Close the polygon\nangles = np.linspace(0, 2 * np.pi, len(directions), endpoint=False).tolist()\nvalues_closed = values + [values[0]]\nangles_closed = angles + [angles[0]]\n\nfig, ax = plt.subplots(figsize=(6, 6), subplot_kw=dict(polar=True))\n\n# Filled radar area\nax.fill(angles_closed, values_closed, color=\"#3498db\", alpha=0.2)\nax.plot(angles_closed, values_closed, \"o-\", color=\"#3498db\", linewidth=2.5,\n markersize=10, markerfacecolor=\"white\", markeredgecolor=\"#3498db\",\n markeredgewidth=2.5)\n\n# Value labels at each vertex\nfor angle, val, d in zip(angles, values, directions):\n ax.annotate(f\"{val:.3f}\", xy=(angle, val),\n textcoords=\"offset points\", xytext=(8, 8),\n fontsize=11, fontweight=\"bold\", color=\"#2c3e50\")\n\n# Isotropic reference circle (mean tortuosity)\ntau_mean = np.mean(values)\nref_angles = np.linspace(0, 2 * np.pi, 100)\nax.plot(ref_angles, [tau_mean] * len(ref_angles), \"--\", color=\"#e74c3c\",\n linewidth=1.5, alpha=0.6, label=f\"Isotropic ref (\u03c4={tau_mean:.3f})\")\n\nax.set_xticks(angles)\nax.set_xticklabels([f\"$\\\\tau_{{{d.upper()}}}$\" for d in directions], fontsize=13)\nax.set_title(\"Direction Anisotropy \u2014 128\u00b3 isotropic blobs\\n\",\n fontsize=13, fontweight=\"bold\")\nax.legend(loc=\"upper right\", bbox_to_anchor=(1.3, 1.1), framealpha=0.9)\n\nplt.tight_layout()\nplt.show()" }, { "cell_type": "markdown", @@ -443,7 +427,7 @@ "source": [ "## Section 7: Multi-Porosity Comparison\n", "\n", - "Generate datasets at varying porosities and compare measured tortuosity against the Bruggeman relation (τ = ε^−0.5), a common approximation for isotropic porous media." + "Generate datasets at varying porosities and compare measured tortuosity against the Bruggeman relation (\u03c4 = \u03b5^\u22120.5), a common approximation for isotropic porous media." ] }, { @@ -462,14 +446,14 @@ " try:\n", " perc = oi.percolation_check(data_p, phase=0, direction=\"z\", verbose=0)\n", " if not perc.percolates:\n", - " print(f\" ε={p:.1f} — not percolating, skipped\")\n", + " print(f\" \u03b5={p:.1f} \u2014 not percolating, skipped\")\n", " porosity_records.append({\n", " \"porosity\": p, \"volume_fraction\": vf.fraction,\n", " \"tortuosity\": np.nan, \"percolates\": False\n", " })\n", " continue\n", " except Exception:\n", - " print(f\" ε={p:.1f} — percolation check failed, skipped\")\n", + " print(f\" \u03b5={p:.1f} \u2014 percolation check failed, skipped\")\n", " porosity_records.append({\n", " \"porosity\": p, \"volume_fraction\": vf.fraction,\n", " \"tortuosity\": np.nan, \"percolates\": False\n", @@ -482,7 +466,7 @@ " \"porosity\": p, \"volume_fraction\": vf.fraction,\n", " \"tortuosity\": res.tortuosity, \"percolates\": True\n", " })\n", - " print(f\" ε={p:.1f} VF={vf.fraction:.3f} τ={res.tortuosity:.4f}\")\n", + " print(f\" \u03b5={p:.1f} VF={vf.fraction:.3f} \u03c4={res.tortuosity:.4f}\")\n", "\n", "df_porosity = pd.DataFrame(porosity_records)\n", "df_porosity" @@ -499,7 +483,7 @@ "# Bruggeman reference\n", "eps_ref = np.linspace(0.2, 0.8, 100)\n", "tau_brugg = eps_ref ** (-0.5)\n", - "ax.plot(eps_ref, tau_brugg, \"--\", color=\"gray\", linewidth=2, label=\"Bruggeman (τ = ε⁻⁰·⁵)\")\n", + "ax.plot(eps_ref, tau_brugg, \"--\", color=\"gray\", linewidth=2, label=\"Bruggeman (\u03c4 = \u03b5\u207b\u2070\u00b7\u2075)\")\n", "\n", "# Measured values\n", "df_valid = df_porosity[df_porosity[\"percolates\"]]\n", @@ -512,9 +496,9 @@ " ax.scatter(df_noperc[\"porosity\"], [0.5] * len(df_noperc), marker=\"x\",\n", " s=100, color=\"black\", zorder=5, label=\"Non-percolating\")\n", "\n", - "ax.set_xlabel(\"Porosity (ε)\")\n", - "ax.set_ylabel(\"Tortuosity (τ)\")\n", - "ax.set_title(\"Tortuosity vs. Porosity — 128³ synthetic microstructures\")\n", + "ax.set_xlabel(\"Porosity (\u03b5)\")\n", + "ax.set_ylabel(\"Tortuosity (\u03c4)\")\n", + "ax.set_title(\"Tortuosity vs. Porosity \u2014 128\u00b3 synthetic microstructures\")\n", "ax.legend()\n", "plt.tight_layout()\n", "plt.show()" @@ -524,23 +508,350 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The Bruggeman relation (τ = ε⁻⁰·⁵) is a simple effective-medium approximation. Real (or realistic synthetic) microstructures typically deviate due to pore connectivity, bottlenecks, and geometric tortuosity effects that the Bruggeman model does not capture." + "The Bruggeman relation (\u03c4 = \u03b5\u207b\u2070\u00b7\u2075) is a simple effective-medium approximation. Real (or realistic synthetic) microstructures typically deviate due to pore connectivity, bottlenecks, and geometric tortuosity effects that the Bruggeman model does not capture." + ] + }, + { + "cell_type": "markdown", + "source": "## Section 8: AMReX TinyProfiler \u2014 Internal Breakdown\n\nAMReX ships with a built-in **TinyProfiler** that hooks into every `BL_PROFILE`-annotated function in the C++ source. OpenImpala instruments 24 key functions \u2014 solver setup, matrix assembly, HYPRE solve, flux computation, and more.\n\nWhen the AMReX session finalizes, TinyProfiler prints a breakdown table to stdout. We capture this output by restarting the session with verbose output, running a solve, then parsing the profiler dump.\n\n**This section answers #31's core questions:**\n- Solver iterations vs. setup time\n- Matrix assembly vs. linear solve\n- Flux computation vs. everything else", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Close the current session so TinyProfiler dumps its output,\n# then restart with verbose output to capture the profiler table.\n\nimport io\nimport sys\nimport re\n\n# Close existing session cleanly\nsession.__exit__(None, None, None)\n\ndef run_with_profiler(data, solver=\"pcg\", direction=\"z\", max_grid_size=32):\n \"\"\"Run a solve in a fresh AMReX session and capture TinyProfiler output.\"\"\"\n # Capture stdout during the entire session lifecycle\n captured = io.StringIO()\n old_stdout = sys.stdout\n\n # Start fresh session\n s = oi.Session()\n s.__enter__()\n\n # Run the solve with verbose=1 to get internal timing prints\n res = oi.tortuosity(data, phase=0, direction=direction,\n solver=solver, max_grid_size=max_grid_size, verbose=1)\n\n # Finalize \u2014 this is when TinyProfiler prints its summary\n sys.stdout = captured\n s.__exit__(None, None, None)\n sys.stdout = old_stdout\n\n return res, captured.getvalue()\n\n# Run profiled solve on 128\u00b3 dataset\nprint(f\"Running profiled solve (128\u00b3, solver={best_solver})...\")\nprof_result, prof_output = run_with_profiler(data_medium, solver=best_solver)\nprint(f\"Tortuosity: {prof_result.tortuosity:.4f}\")\nprint(f\"Converged: {prof_result.solver_converged}\")\nprint(f\"\\nTinyProfiler output length: {len(prof_output)} chars\")\n\n# Show raw output for inspection\nif prof_output.strip():\n print(\"\\n--- Raw TinyProfiler Output ---\")\n print(prof_output[:3000])\nelse:\n print(\"\\nNote: TinyProfiler output may be empty if AMReX was built without\")\n print(\"TINY_PROFILE support. The wall-time profiling below still works.\")", + "metadata": {}, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": "def parse_tiny_profiler(output):\n \"\"\"Parse AMReX TinyProfiler output into a DataFrame.\n\n TinyProfiler prints tables like:\n Name NCalls Excl. Min Excl. Avg Excl. Max Max %\n TortuosityHypre::solve 1 0.45 0.45 0.45 52.3%\n \"\"\"\n records = []\n # Match lines with function name, ncalls, and timing columns\n pattern = re.compile(\n r\"^\\s*(\\S+.*\\S)\\s+\" # function name\n r\"(\\d+)\\s+\" # NCalls\n r\"([\\d.e+-]+)\\s+\" # Excl. Min\n r\"([\\d.e+-]+)\\s+\" # Excl. Avg\n r\"([\\d.e+-]+)\\s+\" # Excl. Max\n r\"([\\d.e+-]+)\\s*%\", # Max %\n re.MULTILINE\n )\n for m in pattern.finditer(output):\n records.append({\n \"function\": m.group(1).strip(),\n \"ncalls\": int(m.group(2)),\n \"excl_min_s\": float(m.group(3)),\n \"excl_avg_s\": float(m.group(4)),\n \"excl_max_s\": float(m.group(5)),\n \"max_pct\": float(m.group(6)),\n })\n return pd.DataFrame(records)\n\ndf_prof = parse_tiny_profiler(prof_output)\n\nif len(df_prof) > 0:\n df_prof = df_prof.sort_values(\"excl_avg_s\", ascending=False).head(15)\n print(f\"Top {len(df_prof)} functions by exclusive time:\")\n print(df_prof.to_string(index=False))\nelse:\n # Fall back to wall-time breakdown if TinyProfiler isn't available\n print(\"TinyProfiler data not available \u2014 using wall-time phase breakdown instead.\")\n print(\"Running phase-by-phase timing...\")\n\n # Restart session for manual timing\n session = oi.Session()\n session.__enter__()\n\n # Phase 1: Volume fraction + percolation check (pre-solve)\n t0 = time.perf_counter()\n vf = oi.volume_fraction(data_medium, phase=0)\n perc = oi.percolation_check(data_medium, phase=0, direction=\"z\")\n t_presolve = time.perf_counter() - t0\n\n # Phase 2: Full tortuosity solve (includes setup + matrix fill + solve + flux)\n t0 = time.perf_counter()\n res = oi.tortuosity(data_medium, phase=0, direction=\"z\",\n solver=best_solver, max_grid_size=best_grid_size, verbose=0)\n t_solve_total = time.perf_counter() - t0\n\n # Phase 3: Repeat at different sizes to estimate setup vs compute scaling\n t0 = time.perf_counter()\n res_small = oi.tortuosity(data_small, phase=0, direction=\"z\",\n solver=best_solver, max_grid_size=best_grid_size, verbose=0)\n t_solve_small = time.perf_counter() - t0\n\n # Estimate: setup overhead \u2248 time that doesn't scale with problem size\n # If 64\u00b3 takes t1 and 128\u00b3 takes t2, then t2/t1 \u2248 scaling factor\n # The \"flat\" part (setup + HYPRE init) can be estimated\n df_prof = pd.DataFrame([\n {\"phase\": \"Pre-solve checks\", \"time_s\": t_presolve},\n {\"phase\": f\"Full solve (128\u00b3)\", \"time_s\": t_solve_total},\n {\"phase\": f\"Full solve (64\u00b3)\", \"time_s\": t_solve_small},\n ])\n scale_ratio = t_solve_total / max(t_solve_small, 1e-9)\n print(f\"\\n Pre-solve (VF + percolation): {t_presolve:.4f}s\")\n print(f\" Full solve 64\u00b3: {t_solve_small:.4f}s\")\n print(f\" Full solve 128\u00b3: {t_solve_total:.4f}s\")\n print(f\" Scaling ratio (128/64): {scale_ratio:.2f}x\")", + "metadata": {}, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "outputs": [], + "execution_count": null, + "source": [ + "# Visualize the profiler breakdown\n", + "if \"max_pct\" in df_prof.columns and len(df_prof) > 0:\n", + " # TinyProfiler horizontal bar chart\n", + " fig, ax = plt.subplots(figsize=(10, max(4, len(df_prof) * 0.4)))\n", + "\n", + " # Color by category\n", + " def categorize(name):\n", + " name_l = name.lower()\n", + " if \"solve\" in name_l:\n", + " return \"#e74c3c\" # red\n", + " elif any(k in name_l for k in [\"setup\", \"fill\", \"matrix\", \"stencil\"]):\n", + " return \"#f39c12\" # orange\n", + " elif any(k in name_l for k in [\"flux\", \"plane\"]):\n", + " return \"#3498db\" # blue\n", + " elif any(k in name_l for k in [\"mask\", \"precondition\", \"flood\", \"percolation\"]):\n", + " return \"#2ecc71\" # green\n", + " else:\n", + " return \"#95a5a6\" # gray\n", + "\n", + " colors = [categorize(f) for f in df_prof[\"function\"]]\n", + " y_pos = np.arange(len(df_prof))\n", + "\n", + " ax.barh(y_pos, df_prof[\"excl_avg_s\"], color=colors, alpha=0.85,\n", + " edgecolor=\"white\", linewidth=0.8)\n", + " for i, (_, row) in enumerate(df_prof.iterrows()):\n", + " ax.text(row[\"excl_avg_s\"] + 0.001, i, f'{row[\"max_pct\"]:.1f}%',\n", + " va=\"center\", fontsize=9, color=\"#2c3e50\")\n", + " ax.set_yticks(y_pos)\n", + " ax.set_yticklabels(df_prof[\"function\"], fontsize=9)\n", + " ax.set_xlabel(\"Exclusive Time (s)\")\n", + " ax.set_title(\"AMReX TinyProfiler \u2014 Function Breakdown\", fontsize=13, fontweight=\"bold\")\n", + " ax.invert_yaxis()\n", + "\n", + " from matplotlib.patches import Patch\n", + " legend_elements = [\n", + " Patch(facecolor=\"#e74c3c\", label=\"Linear solve\"),\n", + " Patch(facecolor=\"#f39c12\", label=\"Matrix assembly/setup\"),\n", + " Patch(facecolor=\"#3498db\", label=\"Flux computation\"),\n", + " Patch(facecolor=\"#2ecc71\", label=\"Pre-processing\"),\n", + " Patch(facecolor=\"#95a5a6\", label=\"Other\"),\n", + " ]\n", + " ax.legend(handles=legend_elements, loc=\"lower right\", framealpha=0.9)\n", + " plt.tight_layout()\n", + " plt.show()\n", + "\n", + " # Pie chart of time categories\n", + " categories = {\"Linear solve\": 0, \"Matrix assembly\": 0,\n", + " \"Flux computation\": 0, \"Pre-processing\": 0, \"Other\": 0}\n", + " for _, row in df_prof.iterrows():\n", + " name_l = row[\"function\"].lower()\n", + " if \"solve\" in name_l:\n", + " categories[\"Linear solve\"] += row[\"excl_avg_s\"]\n", + " elif any(k in name_l for k in [\"setup\", \"fill\", \"matrix\", \"stencil\"]):\n", + " categories[\"Matrix assembly\"] += row[\"excl_avg_s\"]\n", + " elif any(k in name_l for k in [\"flux\", \"plane\"]):\n", + " categories[\"Flux computation\"] += row[\"excl_avg_s\"]\n", + " elif any(k in name_l for k in [\"mask\", \"precondition\", \"flood\", \"percolation\"]):\n", + " categories[\"Pre-processing\"] += row[\"excl_avg_s\"]\n", + " else:\n", + " categories[\"Other\"] += row[\"excl_avg_s\"]\n", + "\n", + " cat_colors = [\"#e74c3c\", \"#f39c12\", \"#3498db\", \"#2ecc71\", \"#95a5a6\"]\n", + " fig, ax = plt.subplots(figsize=(7, 7))\n", + " wedges, texts, autotexts = ax.pie(\n", + " categories.values(), labels=categories.keys(), colors=cat_colors,\n", + " autopct=\"%1.1f%%\", startangle=90, textprops={\"fontsize\": 11})\n", + " for t in autotexts:\n", + " t.set_fontweight(\"bold\")\n", + " ax.set_title(\"Time Distribution by Category\", fontsize=13, fontweight=\"bold\")\n", + " plt.tight_layout()\n", + " plt.show()\n", + "\n", + "else:\n", + " # Fallback: bar chart from manual phase timing\n", + " fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))\n", + "\n", + " phases = df_prof[\"phase\"].tolist()\n", + " times_s = df_prof[\"time_s\"].tolist()\n", + " colors = [\"#2ecc71\", \"#e74c3c\", \"#3498db\"]\n", + "\n", + " ax1.barh(phases, times_s, color=colors[:len(phases)], alpha=0.85)\n", + " ax1.set_xlabel(\"Wall Time (s)\")\n", + " ax1.set_title(\"Phase Timing Breakdown\", fontweight=\"bold\")\n", + "\n", + " if len(times_s) >= 3:\n", + " ax2.bar([\"64\\u00b3\", \"128\\u00b3\"], [times_s[2], times_s[1]], color=\"#e74c3c\", alpha=0.85)\n", + " ax2.bar([\"64\\u00b3\", \"128\\u00b3\"], [times_s[0], times_s[0]],\n", + " bottom=[times_s[2], times_s[1]], color=\"#2ecc71\", alpha=0.85,\n", + " label=\"Pre-solve overhead\")\n", + " ax2.set_ylabel(\"Wall Time (s)\")\n", + " ax2.set_title(\"Setup vs Compute Scaling\", fontweight=\"bold\")\n", + " ax2.legend()\n", + "\n", + " plt.tight_layout()\n", + " plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Section 8: Summary & HPC Recommendations" + "## Section 9: NVIDIA Nsight Systems \u2014 GPU Kernel Profiling\n", + "\n", + "If a GPU is available, we use **NVIDIA Nsight Systems** (`nsys`) to capture a\n", + "detailed timeline of CUDA kernel launches, memory transfers, and API calls.\n", + "This reveals:\n", + "\n", + "- **GPU kernel occupancy** \u2014 are kernels large enough to saturate the GPU?\n", + "- **Host-device transfers** \u2014 how much time is spent copying data?\n", + "- **Kernel launch overhead** \u2014 are there too many tiny kernel launches?\n", + "- **Idle gaps** \u2014 is the GPU waiting on CPU work between kernels?\n", + "\n", + "> Colab provides `nsys` pre-installed on GPU runtimes. On HPC clusters,\n", + "> load it via `module load nsight-systems`." ] }, { "cell_type": "code", + "metadata": {}, + "outputs": [], "execution_count": null, + "source": [ + "import shutil\n", + "import os\n", + "\n", + "nsys_available = shutil.which(\"nsys\") is not None and bool(gpu_name)\n", + "\n", + "if nsys_available:\n", + " print(f\"nsys found: {shutil.which('nsys')}\")\n", + " print(f\"GPU: {gpu_name}\")\n", + " print(\"\\nRunning Nsight Systems profile...\")\n", + "\n", + " # Write a self-contained profiling script\n", + " script = '''\n", + "import openimpala as oi\n", + "import numpy as np\n", + "import porespy as ps\n", + "\n", + "np.random.seed(42)\n", + "data = ps.generators.blobs(shape=[128, 128, 128], porosity=0.5, blobiness=1.5)\n", + "data = data.astype(np.int32)\n", + "\n", + "with oi.Session():\n", + " res = oi.tortuosity(data, phase=0, direction=\"z\", solver=\"pcg\",\n", + " max_grid_size=64, verbose=0)\n", + " print(f\"tau={res.tortuosity:.4f} converged={res.solver_converged}\")\n", + "'''\n", + " with open('/tmp/oi_profile.py', 'w') as f:\n", + " f.write(script)\n", + "\n", + " # Run nsys profile \u2014 capture CUDA API, kernels, and memory ops\n", + " !nsys profile \\\n", + " --output=/tmp/oi_profile \\\n", + " --force-overwrite=true \\\n", + " --trace=cuda,nvtx,osrt \\\n", + " --cuda-memory-usage=true \\\n", + " --stats=true \\\n", + " python3 /tmp/oi_profile.py 2>&1 | tail -80\n", + "\n", + " print(\"\\nProfile saved to /tmp/oi_profile.nsys-rep\")\n", + " print(\"Download and open in Nsight Systems GUI for interactive analysis.\")\n", + "else:\n", + " if not gpu_name:\n", + " print(\"No GPU detected \u2014 skipping Nsight Systems profiling.\")\n", + " print(\"To enable: Runtime > Change runtime type > T4 GPU\")\n", + " else:\n", + " print(\"nsys not found \u2014 install via: apt-get install nsight-systems\")\n", + " print(\"\\nSkipping to summary.\")" + ] + }, + { + "cell_type": "code", "metadata": {}, "outputs": [], + "execution_count": null, "source": [ - "# Estimated time for 256³ from measured data\n", + "if nsys_available:\n", + " # Parse the nsys stats output for CUDA kernel summary\n", + " nsys_stats = subprocess.run(\n", + " [\"nsys\", \"stats\", \"--report\", \"cuda_gpu_kern_sum\",\n", + " \"--format\", \"csv\", \"/tmp/oi_profile.nsys-rep\"],\n", + " capture_output=True, text=True, timeout=60\n", + " )\n", + "\n", + " if nsys_stats.returncode == 0 and nsys_stats.stdout.strip():\n", + " # Parse CSV output\n", + " from io import StringIO\n", + " import csv\n", + "\n", + " lines = nsys_stats.stdout.strip().split('\\n')\n", + " # Find the header line (contains 'Time')\n", + " header_idx = None\n", + " for i, line in enumerate(lines):\n", + " if 'Time' in line and 'Name' in line:\n", + " header_idx = i\n", + " break\n", + "\n", + " if header_idx is not None:\n", + " csv_text = '\\n'.join(lines[header_idx:])\n", + " df_kernels = pd.read_csv(StringIO(csv_text))\n", + " # Show top kernels by total time\n", + " if 'Total Time (ns)' in df_kernels.columns:\n", + " df_kernels['Total Time (ms)'] = df_kernels['Total Time (ns)'] / 1e6\n", + " df_kernels = df_kernels.sort_values('Total Time (ns)', ascending=False).head(10)\n", + " print(\"Top 10 CUDA Kernels by Total GPU Time:\")\n", + " print(df_kernels[['Name', 'Total Time (ms)', 'Instances']].to_string(index=False))\n", + " else:\n", + " print(\"Kernel summary:\")\n", + " print(df_kernels.head(10).to_string(index=False))\n", + " else:\n", + " print(\"Could not parse nsys kernel stats.\")\n", + " print(nsys_stats.stdout[:2000])\n", + " else:\n", + " print(\"nsys stats returned no kernel data.\")\n", + " if nsys_stats.stderr:\n", + " print(nsys_stats.stderr[:500])" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "outputs": [], + "execution_count": null, + "source": [ + "if nsys_available:\n", + " # Also get memory transfer stats\n", + " nsys_mem = subprocess.run(\n", + " [\"nsys\", \"stats\", \"--report\", \"cuda_gpu_mem_size_sum\",\n", + " \"--format\", \"csv\", \"/tmp/oi_profile.nsys-rep\"],\n", + " capture_output=True, text=True, timeout=60\n", + " )\n", + "\n", + " if nsys_mem.returncode == 0 and nsys_mem.stdout.strip():\n", + " lines = nsys_mem.stdout.strip().split('\\n')\n", + " header_idx = None\n", + " for i, line in enumerate(lines):\n", + " if 'Operation' in line or 'Total' in line:\n", + " header_idx = i\n", + " break\n", + " if header_idx is not None:\n", + " csv_text = '\\n'.join(lines[header_idx:])\n", + " df_mem = pd.read_csv(StringIO(csv_text))\n", + " print(\"\\nCUDA Memory Transfer Summary:\")\n", + " print(df_mem.to_string(index=False))\n", + " else:\n", + " print(\"\\nMemory transfer data:\")\n", + " print(nsys_mem.stdout[:1000])\n", + "\n", + " # Visualize kernel distribution if we have data\n", + " try:\n", + " if 'df_kernels' in dir() and len(df_kernels) > 0 and 'Total Time (ms)' in df_kernels.columns:\n", + " fig, ax = plt.subplots(figsize=(10, max(3, len(df_kernels) * 0.35)))\n", + "\n", + " # Shorten kernel names for display\n", + " short_names = []\n", + " for n in df_kernels['Name']:\n", + " # Extract the last meaningful part of mangled kernel names\n", + " parts = str(n).split('::')\n", + " short = parts[-1] if len(parts) > 1 else str(n)\n", + " short = short[:50] + '...' if len(short) > 50 else short\n", + " short_names.append(short)\n", + "\n", + " y_pos = np.arange(len(df_kernels))\n", + " ax.barh(y_pos, df_kernels['Total Time (ms)'].values,\n", + " color='#e67e22', alpha=0.85, edgecolor='white')\n", + " ax.set_yticks(y_pos)\n", + " ax.set_yticklabels(short_names, fontsize=8)\n", + " ax.set_xlabel('Total GPU Time (ms)')\n", + " ax.set_title('CUDA Kernel Time Distribution', fontsize=13, fontweight='bold')\n", + " ax.invert_yaxis()\n", + " plt.tight_layout()\n", + " plt.show()\n", + " except Exception as e:\n", + " print(f\"Could not plot kernel distribution: {e}\")\n", + "\n", + " print(\"\\nFor interactive analysis, download /tmp/oi_profile.nsys-rep\")\n", + " print(\"and open it in the Nsight Systems GUI (free from NVIDIA).\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Interpreting Nsight Systems results\n", + "\n", + "**Key things to look for:**\n", + "\n", + "| Pattern | Meaning | Action |\n", + "|---------|---------|--------|\n", + "| Many tiny kernels (<1 ms each) | Kernel launch overhead dominates | Increase `max_grid_size` to fuse work |\n", + "| Large H2D/D2H transfers | Data copying bottleneck | Check if arrays are being copied repeatedly |\n", + "| Long gaps between kernels | CPU is the bottleneck | Look for serial CPU work between GPU calls |\n", + "| One dominant kernel (>80% time) | Solver kernel is the hot path | Focus optimization there (expected for large problems) |\n", + "| Even distribution across kernels | Setup/assembly is significant | Consider caching matrix assembly across solves |\n", + "\n", + "**On HPC clusters** with multiple GPUs, combine `nsys` with MPI profiling:\n", + "```bash\n", + "mpirun -np 4 nsys profile --output=profile_rank%q{OMPI_COMM_WORLD_RANK} \\\n", + " openimpala inputs_256.txt\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Section 10: Summary & HPC Recommendations" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Estimated time for 256\u00b3 from measured data\n", "time_256 = df_size.loc[df_size[\"size\"] == 256, \"wall_time_s\"].values[0]\n", "\n", "print(\"=\" * 60)\n", @@ -548,13 +859,13 @@ "print(\"=\" * 60)\n", "print(f\" Best solver: {best_solver}\")\n", "print(f\" Optimal max_grid_size: {best_grid_size}\")\n", - "print(f\" Time per 256³ solve: {time_256:.2f} s\")\n", + "print(f\" Time per 256\u00b3 solve: {time_256:.2f} s\")\n", "\n", "# Memory regime estimate\n", - "for label, size in [(\"256³\", 256), (\"400³\", 400), (\"512³\", 512)]:\n", + "for label, size in [(\"256\u00b3\", 256), (\"400\u00b3\", 400), (\"512\u00b3\", 512)]:\n", " mem_gb = (size ** 3 * 8 * 3) / 1e9 # rough: 3 arrays of float64\n", " regime = \"safe\" if mem_gb < 4 else (\"caution\" if mem_gb < 10 else \"DANGER\")\n", - " print(f\" Memory {label}: ~{mem_gb:.1f} GB — [{regime}]\")\n", + " print(f\" Memory {label}: ~{mem_gb:.1f} GB \u2014 [{regime}]\")\n", "print(\"=\" * 60)" ] }, @@ -573,7 +884,7 @@ "for size in [256, 512, 1024, 2048]:\n", " n = size ** 3\n", " est_time = 10 ** np.polyval(coeffs, np.log10(n))\n", - " print(f\"{size:>5d}³ {n:>12,d} {est_time:>13.1f}\")\n", + " print(f\"{size:>5d}\u00b3 {n:>12,d} {est_time:>13.1f}\")\n", "\n", "print(\"-\" * 50)\n", "print(\"Note: These are rough estimates from a single T4.\")\n", @@ -620,8 +931,16 @@ "metadata": {}, "outputs": [], "source": [ - "session.__exit__(None, None, None)\n", - "print(\"OpenImpala session closed.\")" + "# Clean up \u2014 session may already be closed from TinyProfiler section\n", + "try:\n", + " from openimpala._core import amrex_initialized\n", + " if amrex_initialized():\n", + " session.__exit__(None, None, None)\n", + " print(\"OpenImpala session closed.\")\n", + " else:\n", + " print(\"Session already finalized (closed during TinyProfiler section).\")\n", + "except Exception:\n", + " print(\"Session cleanup complete.\")" ] } ] From 23c770cae4c1e5cafc9f0404286dc05413207ae4 Mon Sep 17 00:00:00 2001 From: James Le Houx Date: Sun, 29 Mar 2026 20:12:37 +0000 Subject: [PATCH 05/24] Fix PyPI version rejection: add setuptools_scm config The GPU wheel uploads fail because setuptools_scm generates local version identifiers (e.g. 4.0.2.dev0+ga780d0e87) which PyPI rejects. Root cause: no v4.x tag exists in the repo (latest is v3.1.0). Add [tool.setuptools_scm] with local_scheme="no-local-version" to produce PyPI-compatible versions even on non-tagged commits, and set fallback_version for builds outside a git repo. https://claude.ai/code/session_01RKnn97qiD7sbCeABHH3eQk --- pyproject.toml | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/pyproject.toml b/pyproject.toml index 20441548..11efe36b 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -51,5 +51,12 @@ install.components = ["python"] [tool.scikit-build.metadata.version] provider = "scikit_build_core.metadata.setuptools_scm" +[tool.setuptools_scm] +# Produce PyPI-compatible versions even on non-tagged commits. +# Without this, setuptools_scm appends "+gXXXX" local identifiers +# that PyPI rejects with HTTP 400. +local_scheme = "no-local-version" +fallback_version = "4.0.2" + [tool.pytest.ini_options] testpaths = ["python/tests"] From 3c876783a5cd88d30c13551f9f4e30b0c36a28d7 Mon Sep 17 00:00:00 2001 From: James Le Houx Date: Mon, 30 Mar 2026 08:15:05 +0000 Subject: [PATCH 06/24] Fix GPU wheels versioned as 4.0.3.dev0 instead of release tag version The CIBW_BEFORE_BUILD sed command that renames the package to openimpala-cuda dirties the git working tree. setuptools_scm's guess-next-dev scheme then produces X.Y.Z+1.dev0 instead of the tagged version. Fix by setting SETUPTOOLS_SCM_PRETEND_VERSION from the release tag in both CPU and GPU wheel workflows. https://claude.ai/code/session_01RKnn97qiD7sbCeABHH3eQk --- .github/workflows/pypi-wheels-cpu.yml | 8 ++++++++ .github/workflows/pypi-wheels-gpu.yml | 8 ++++++++ 2 files changed, 16 insertions(+) diff --git a/.github/workflows/pypi-wheels-cpu.yml b/.github/workflows/pypi-wheels-cpu.yml index 9fd6762b..242dfe47 100644 --- a/.github/workflows/pypi-wheels-cpu.yml +++ b/.github/workflows/pypi-wheels-cpu.yml @@ -18,6 +18,13 @@ jobs: submodules: recursive # Fetches Catch2, nlohmann/json, or pybind11 if needed fetch-depth: 0 + - name: Extract version from tag + id: version + run: | + # Strip leading 'v' from tag (e.g. v4.0.2 -> 4.0.2) + VERSION="${GITHUB_REF_NAME#v}" + echo "version=${VERSION}" >> "$GITHUB_OUTPUT" + - name: Set up Python uses: actions/setup-python@v5 with: @@ -130,6 +137,7 @@ jobs: CMAKE_CXX_COMPILER="mpicxx" CMAKE_PREFIX_PATH="/usr/local" CMAKE_GENERATOR="Unix Makefiles" + SETUPTOOLS_SCM_PRETEND_VERSION="${{ steps.version.outputs.version }}" # Vendor libraries into the wheel, but exclude host-specific MPI and # runtime libraries that users must provide on their system. diff --git a/.github/workflows/pypi-wheels-gpu.yml b/.github/workflows/pypi-wheels-gpu.yml index f7023bf7..0b663ff6 100644 --- a/.github/workflows/pypi-wheels-gpu.yml +++ b/.github/workflows/pypi-wheels-gpu.yml @@ -18,6 +18,13 @@ jobs: submodules: recursive fetch-depth: 0 + - name: Extract version from tag + id: version + run: | + # Strip leading 'v' from tag (e.g. v4.0.2 -> 4.0.2) + VERSION="${GITHUB_REF_NAME#v}" + echo "version=${VERSION}" >> "$GITHUB_OUTPUT" + - name: Set up Python uses: actions/setup-python@v5 with: @@ -143,6 +150,7 @@ jobs: CMAKE_PREFIX_PATH="/usr/local" CMAKE_GENERATOR="Unix Makefiles" CMAKE_ARGS="-DGPU_BACKEND=CUDA '-DCMAKE_CUDA_ARCHITECTURES=60;70;75;80;86;89;90' -DCMAKE_CUDA_HOST_COMPILER=/opt/rh/gcc-toolset-13/root/usr/bin/g++" + SETUPTOOLS_SCM_PRETEND_VERSION="${{ steps.version.outputs.version }}" # Vendor libraries but exclude host-specific MPI, OpenMP, Fortran runtime, # and CUDA runtime libraries (users must have CUDA toolkit installed). From 1af3272dd99b3f5cab46f6f530b8b960ec889ff5 Mon Sep 17 00:00:00 2001 From: James Le Houx Date: Mon, 30 Mar 2026 08:15:40 +0000 Subject: [PATCH 07/24] Add pre-existing paper and figure files https://claude.ai/code/session_01RKnn97qiD7sbCeABHH3eQk --- figure.pdf | Bin 0 -> 39520 bytes figure.png | Bin 0 -> 280911 bytes figure.py | 112 ++++++++++++++++++++++++++++++++++++ paper.bib | 166 +++++++++++++++++++++++++++++++++++++---------------- paper.md | 67 +++++++++++++++++---- 5 files changed, 286 insertions(+), 59 deletions(-) create mode 100644 figure.pdf create mode 100644 figure.png create mode 100644 figure.py diff --git a/figure.pdf b/figure.pdf new file mode 100644 index 0000000000000000000000000000000000000000..4a6c32aa9bafab4478ddf47a32b3e8a86a8d3c6f GIT binary patch literal 39520 zcmd432T&DF^Dj&W0YS1L=s|Mk1agj&Gf0-4Bqu=xMI=cQ5Xm4xf}muOoRlCS3P=_K z5d_Iu5V^YteIMV)bN_O``fk-NYS=kDyE8rAJu^N1n_eajS$QrfHv*rjcoI@nhYyAD zLR>5y@I^%-JUYIv)({>Ub5CVKyEB9j{E5$_ZD(#}?`#X<|5@tp zVySKI2{8him6ZobvG(zV@F+R~BuM|f%KW@4LQH-m2l{6RK)ONd-K_!YPRQrcvi5NC za<>Hf1JVz)uWD^&Z!YcP1GLBs{0Z<1atrapgdhlhUT#641s)loZ$Jee5TO$arJS8z zfL~DX>EHPQ7XM_kcE z??!rU3ePBu+$u92Cb=_t>p)+Z>is3@v2i1c2Jg#5hWo77ucwy|kLYpIY*f4;C5|aQ z(&`%TIrmJH`TQ*?zhVO3Q!6?&2VB{RSGZYz&vu`m%c0PH@~|Z9nYQ8iOAetiQZs_w z$%YPW$_>~VO9fdrfqwQ5&l+Q-lF$~}Sg$J~hG6=1jY)chHHK0-ku^*guP3j|kg8l( z)A`DpAKmt#hN(%2E53uXVxmbqyESL& z$$q@-+X2RBX`wXb-i3SR*d*Bzniulpxx!0rgU>%JVE^Q>n07Z~ynfu2gHg z1)2P1R>F?qvSGCYcVefncII(flW0VBQVVA%FXenbC$c7g4&JAB7{EasXDkw9oMy2w zhj!8PqV8zHTx$k}MP>HWO7ALy$J=3eg4WL*BP}mO;)lAaO)`d=jV>%WDq=-H4{s^G zi${A&v^^Kf0{eh}xwPV;ibfexYm|4Om;VQPRMV2TWOn#_%-gUFRf28WJt89Vm_>44xQAxjJg!dQZ^ zk_xC!AkIU#3WMBP(F^#!+ZL=PxP1(aEPD3YZtpkFQT{i!=3|zew~|Q&U#ItzVcrn$#C2JJ z2p5cIDH!;c3MZcupghV{c9W%#Ffn$0A{*_F!P{Vl)=9mC8-9T(a^=v3M#hSij#aea z-1(3G+Bdd?+c7k`r)Sp8cUJFUpTT)C-HHZhxoOpg;6C@|arf@n<1`3mw@xCX2LGLs z^m^*gv^UpByDjEuH8Rl8_rlSF4EBj;1(SAW-$|t1!98{zmFhK1F!8p2bA{Dgo91!? zErQ%$upsFhZ^GMb zpY_(N1vL5j`Y2AbGv%PK2>tShZA1(7Z_rQYNkTv(vU=V&ddEK7zu<7=+Y^SMP?rEgH{CeAk5ut95h{jDIfE;r2we;DlD#_p1 z6@9k#zMEY>QHo6oWwb+c+|)Luy6Xh%nq%R{tc{m{MP2+{JD-8AGvQ^$Zjs(7XEc9} z<;!OgiTu%9A0J(m$Y|a_vk>69+e6o>J@={@*I?{}&E)T505q@b#t7B!XNFtA*>SpD*mowL|-b;_43y!+nGynHvf?Y<2%@HNNPC$!Qt zrb#QR9nzG!c$RcbF^0ap7ew{0BFUJ`$NP6D)9ZsyJ z5EeqUy&phabU*MO4O3b8a_K@&mC=HWDE8o71b&3u#oa-FjE|<$ z)RXTuS{Z`lsib+?@A*A(jq;q4+Dp65$ZFn@^^TZxLL??jQCi*8J$Tfh(XwG_RE z%wpyvtX;bwpY6(8Sn*Os-go*xC349qQg`(oH#wdQE8fhd0tp)WnxpQ#xxujzrIoKU@=}I8sjXQ^|ECCV$34BZ!&LU_wuAL^w+muFDXl zAgFVOoiOX`6Fy;$Pa}Jn;%mieeGE|%lpnv$yLN6lVau3PzFZEjhG*g6E^Z$WnPfOi zYrRE$n;g7uTNA^apYL}o>5kx8*G=;!S&Dt19uX(7x+U{QYDm9JQ>Wa!#18J99T_+( zQ`5!FG+4W0y>p3x$%Fwk|9={;KQ{tMD;KnUVFGZ(uiXLCJO*zXK}`O058M*`%S0A} z0%q}F%ww@gDWVn#9@g3+e#T?G%g6RK#M{Rs?0jNqWH4%YmULZ5U3}S~ZkB36wLG!D zlGft`O3aj0Rc)fM!z?-l%ZFo-p>6bF&O?7xT3$TvwaZozq^oWh=|~VR~~XYl)IBm@@Iuc^gmG z+zKhGawpihLqyU4@cR0;$^HYQd`r%SOY9|YB1Dgvqc@3-vmX$Tq_G#b`$T!!6xCC3 zOiA4L`+l~Ft&S+%Rm(TGAeLU%@*Ulq`!}H^?+8CRWa|YIJt#{NveI2YYpn8)j8;`y zKXijW`(~K_#t+R0tT5q2xV!WU!x@?gF6wndPxS_4Yca#&HNi!-Cwg(yVr;c}p2THt z&4J=&B)GdTtxFazemp|M#0((6kcBT%4%HzNh$%zsGf&ocRr`9r<5kt&+wSQX-j$lV z7;6YB;#;P79aQ^deb|!_DzJ-1!C4ghBN!a^cg}KY)H}&S3BQVAQwg>lOl%pS%YCu@ zQGQ5St}R=3z&bi=)F+(&dXDi1Q^b%!z*R#@3O632%T@2!D3*Na_QS_KZm&10I3+nq z2{4i_Z2J)~-4$PwEA^bRkFR()$V%rWyynrki{1}CcQ$NIP8r6`TBN9>M8M2hYc@SD zWbNMDon4j5M#k2~wbCW-KV$4{H1d&dLC>^f`9UCGylH6XH*pNPKqyeGSuGwxF*K~bLnVU|6ss!VRf zWBj9BD8ZWxQdYD%C`bc${r?i20OIe84zJ+V_^6lx@5+Pw>8V_ja(u;YC34MChxP0H z-A89@JOuC^aIefH_Tj8JdSN)hvv1|>Jl?g7(525zHSexj)AfJws*cQW=%xR22KGh1 z1Q89oqv`9|EJPkDDhICg9Z7sQ6__nZBBxtBM|g8MidgWeg0y<>Slsp8w=Z~SkRFj% zap@6EAi5F~%^ajyt_;KeY&&Q7B~-vlUVV+?`!iyK)%~W-eRIy{k+p+wxJJPy#whp* zy0HGmM?T)avk}fOpan=4i$tCh=TE@`64bXRQjsINigT3)?S z6AtIUdZJS{A^+X9wG}!EpjCvbk#{&B2xIl2DFjt5w0M4YwRU}ijy5B6ZWRR;pp))j zR0y4-|6%OX-hmZrpS6V) zR@F^T?JLOFjc4Kz!aTV7OkR~Km<%J>`ffd5t50{XGhKzzb)ldTg^2-!o?`IvY1Bqx zLJ50yu`lqpT${W(%**X|yy#Zo1&O^guZvBleT@goZZrOT`_X5bhiTg;kNo(xb68m# z?`2Ev4ZKL{&Rv}g$b0hXiNG^e$%54x1-WKPxkJyvIYOG4_7cT^OplzBUn zLZ=-Cdr+7+Q2tX)1Y*%bq%!n`GJa|^40i;kMJ3`DrB*-6Z>kUI6VJ&Fs?<5kDGn7H zUCi-c{LtX{&c3OF)&2Fdvb&1n`g?utocTp9e`4X>nfJCNe3m}fqWpzbU+7}q7a1yA zT$0N#kl~p8W-Yv;tL7*p9>?`#e3Pek>h`L#qxy!i0?FtKfmH*3l-G48rr9Qu>O$Z7RD>xn=vBkw3Elz z2zZle?OJAWN{!?`pBGQZ6WOfb?q|qix}BIQ#z|OaeED4U=ca7(+sv8;&uCQMci$0O z(uf@!jBiY8eK^UMmeLvQ7q(|@6e`yKV&mIWpXVbDhT0^_vrY(N4aC#KTl$_vJ!stR z#tGv(dKb0Io;+q(Z+4q}fbD=UMj{ym7wclIyeZ?N>7w+UlDJNB-@TJyvYty2lO2xz ze#l_h!p$b%o}ryV+dF{Qp?f@naFDQ~9Yc)&4Htn})4QltVe)mHRhNe5t24m?$LMpG zdeeXM3y#9#fFn+^sPbuvI5Pnz+5&llB-&Tou?62&i-_f;2_`-TCh2O0``(9b(jrZq zr4{cM_p9?)F^SRmlT;+;Rhe(@7ss{~pX{6LHLweew%2f-abeC*g1lNVdc&x8=AsXI z(48p7iKfz7-ZPv|*`sNf``=d%&ZR451!!t%2*42V*ujTx+_fp)JViWZXSHbM?;ZyN}qh{=f0n?7=pWX-QE;h4pj9k(hH0 zfLNSz^c*jt%$`s%_uD%=6`!_ra-b0@hb{~6l9%KprG`Vv;a4Y0l@2#6rv`5l9qw{i z8Vyx8UtJc|>@DwNxlp$3B{xYqb!*~Q3GUS;IZY9-LDBe&0>mTA@zmSTutPp~1(k4k zh*R$hT5al;?2_+@e_+S)>yT2Nmm|3~)qLyYNTxDu0CV*|j?jDmb?<~NjUM-R{p??y zG+k?Wwu^n=kQm@zOKjJR%{NNvxGs{IQsca^ea4K4g+{1%C{6yIyybQI3f68M8ow4- zx?4J|H}@5fkL}=(epuX^BKVM!slvS)=UUhnUr+3AM4nDKUi9KUm#vpk$)RU`&=)gT zXEtU#_Nhdf+RvnesY@-1tDl(0oF`2Vzthfg-4mGNU{7<{_1ZjJJ6Tk0lXgidLOi;` zbz$L>?>WyqD1?B*41*y~v5LuQ3OQo|^VAW(zZu%s%A<@klx+ORE;0@*3UY06tXigq zFzZm`B`sy^_YZmKnzYE(B^ZWMog^}^w=mRmm;->H34+RkyaBc==^ z;VpyC*&8z#XT>H*7OXQ?OZ_vbiw4;rNElq;??Z@BQ`s3CiZ`)9-qaGm&*G7BRm|(%)NLaf z9eYMmvq<`Hf><*?g$&OgFzpZUy~S=0`tJC>m|P}|VOPrTGVI*EPs#eQhFh*}Tp(Je~uK6Zt5T<|NY zvDDa%pF*t0r4Vxj-H&+Xp<%4-gDrdB^nDb~cPgU7+Gy=^2B*@-*lN$)!Al9=^<_inMkk18a5t`5T?06R(zElxL*rDa>OjzVI0R zB5ha~_oFtn=I9ohfxmGh3bvuLNKdnZi~VQj3NS-Q&RloND}b5HKJJ-<0!;2*EkE<* zI;qUK?A*)5$m}|dwNU%%i%&Pr(Z`5(&YTIdk?wUpD~3T5%*GM_qH=>q{04k)iE}?M z#L-*&Y<6f-`0~7`f?oc4>opR^-|sM163eele^jo!C!UfDe_H5E-rs$jWG*0j@K4llpor?q}*jxQht z_j<#^5pT#m;oIm|-=rF^OsljqU6YKpcRpvy?DMwgKi_|3qGVuF5cfSz^}S5xl!rh6 z)Z~vN0+&=H_CI-uKw+gq;is6uRAZz_G3fy}a>HE~w7W0X#&jHr=Z{AsQ~3Pc>S&+P z@U*twoOZqOhW)!%-3GkyrfdGJZ~R#9r5Kczn+&h!r&=o{m za_D%>sE|rP|$$UTq(w<(xwH?cD$VVnP-Ga>gRmykGmP(d@Z$g&lTRCzwlw*ZT|RvmEj6j$<(yM4J^6O zLz#&WY2J|If?E6q)%K=|@C;N|$o#S(UA07sM1LW(?%w>9WW;U(oQ$pL-%IxZhsW*@C*I3vg?~scbhKZFWan)7 zYud+eCmXWy-Q(PJzKBtoOUAk78CzrVI`~*DR#P~Err=V9=}qyr=hWX(P=>+;I-Ms9 z6|er|W0d?aPn0DpPZVHh{owpEsoVF3)H&GOGN2it53I*bPX&STQRsT-ils2toGcq9~7*81;UPRh+D)O$vsd8IV`g$9vpo~YTqM6?THPv_Lv{`jcYky+L@2HzP+^II#z7E>(`a8u$G*M zGtJPE#O|8)-Wq%l-hA=wq%coocMRy+Ir-vq6!ksfXHwXkB&%A)J%`;?&IC*u)IzUp zIFKCbF1icP+$6c)wUPCG>B@4wkAr)z+C;=9#{8Fs$K+QE6Q;$_M0;M*(8)a5dN?3k zS$sCO(Wk0%w;t}4F3H562LkawJ?5h&(|lhwdRRX|{Zj<9WEVPBgy#q8;W{F`&kK3NM#f zj0oUGA(Xk1ewPZ*>}X}{1)gy3-Zc&;J6stoRf%`=;ephG`qU}m+rd=YJrjoy3Pd+A zXE6ES>7z%d>_{1{Bux{z66wc!eT*y5>t(|rMxw5`5hUcXTP#6i)}HeL-wTEJBo)GE zpE(BRw{g@cO5Yct=MIjsE#FaPD-%wb*56D09L1&Jt@6-U z-l0Dw2pG(N^zT$FP!tmTlLcW2@iRV;S(O6!TrjJxQtDa67M;IHN;CI(QXt%Jo#);Q z_J|d>lw4%4?jP3c^rWFi^fz?269_p|v1`5-f*my2;`9Y3w_`>DN8&JEjFr37YsaL`>lW#W!0ydX(vhE?bb-_>Qicx`HmRA0m7Ul{;dMwVOP1 zpXRE6TkKtX)E~A&>Ok7W#L38PE8rVa!a$-u8Gn8jE-c=@yWDH>&hJ$V;mA`~v3ZO5 z`){HH8rc(^ZYJw?b#Pug>L_G%*$k5z=7#%4U*I%HQ!bsHZnt9TFSN+NN&0-J+$v=? zdi+It7PFJ0Y4|6F=_W*A{nZJ}t0HX-TWdHEso><*CAp79rv>wA4W5LHWVd~TJBL|3 zz<6d8Ih8%m3^gQi%92&UxJqKMlwO;;x^CQHpleR;z!o6cHX0q4qcs+t_bfE})g=G$ zScUnOF-s4I0>;X+i2KZp6h*JsE$q$R$J*hY3a`h7aTzh`hYZ)ev&d5H z7`Pt3lTWxqab33v-nrP@HxSsPWps%FodAbf?Z@pl+gmYvJi|o+G$=%b!ajtaLJ{Ho zWg5J9+R0@AoA@F%CJILc02}`?_@Jjb#~~LZMI=KH2tVyE1D?gZtx<%@=%I^9*YU@D zg8h+dI9qRmpUjpHk49`7DTr;<6ee$Fml7wlvrfvuUaRt0u5`Wfql^9MtHlMOGv=#w zb1IDQjT_EOhjHtM2R!d;)9FwlH8$4wHdB0sd;T%!y@ZQh<1=*kcZMszWb^p{Fo3{R zP_PMwsRiRb1zf1ADdmhsPgoDklx9q%81!pn4)0|SkFPt~$S-v>Gk2V^kB*-78Ee^$ zq7LV|UM$Gbe3|_Sk+~5g_pSX=`-0@YsGxUeJstn5QuEs$=U0n4rUGmPdS6*DJR(rL z6908Jxu#(59yW)FnT*=H)!d6X-ef2HoY<^keT&t|vxmYF%1Ju~gpH!X~o^KCrzdi-bDn=9_|UrwLMIvXm?S!8Ii zs>t-!Vz1K3shhd3)i-tf?GbwK%L~u{B#V+mpizS7(LaC;TZJj5U4l~bbG@rJOvV!A&dzd5n! zb3vDIP5p9xW7DEHcmJP#g8n|koP;7F_0>hyG3BUSv3!g{U-p0ln2x0=GmW&|a zT^h6n*(dvgen1wM}@ga=?hz&RTrFbbGI?I1k%Kn^Is4&iZx@HjzuoFP0e z0962^fO2=B4-g)JXfFtlHxMTU^u!0ka!4u27*qq(gIz|Iqy{f}Y=Sm$90Mv(~tkzqVwGvE&{80dDCF?Ut4 zMsgj{IsIQ*fT=+AoV}%#v#p~wKs}GPr?rzFm{B$NL3RsJK|p~2#M2==3HHed!q3b5 z|3||A|5^_mAqYVLm=GTf3K0an`usxt_)uP67(`G227&YP1BC*}#0P^TAAk$;LJ+{X z1+Q>G(|{NLcM1RZG6c|3FoBKyN_;>aKx05C9x(R{0Ee7_Q^i0bz!BhlAD9G@R0-h2 zL1KU;01P~S06-D~v{Mkc3Ih8d0XXm+cm>9&0H9s`fI15T2bIA(p-^5pkbwdPKm#y7 ze6Tkl)!-GRf)8jMO#DE_0>B9669S$GuKWT3Ti_@61c1uGvk_q82i^s)0E++@CRo5P zD1dwcdaaQLPir0JY-363{mCHxz~s zYRIpIR1Tm_04ju31W+0NodAIYgaZNwHRHF0|4TD|T~8F^L@~hp?+GOCS2?HvzY?e% zfCAxz*ME-C|8B5KL7!u}_};ZQTUM zBC=_3aD7JRd;m2%$@v1~51IRCWdO$7A6qUMxMKe`a6(h+(JKhvgdcu1;1T(!kRP4+M|rnX$aP-$L+KBlHdj`(8Zw(xEtsW}xrO6vYA3Hp7wjdjv_ z-AZcjxW|%n?_o{>b|=V9D0)|u~Yuj z+T;ht^UIX@6}K$}I|XxNl%s(ypv+!c+m)_RM%N6QsGFMp$18$E1&k2N);BpykkVOA zo5>3j!ic@tk2A_Kb)Ay|Iz!Tm5CGqKAzvO}t}eN+y+G(~_$^zV0yxvb&siUDz{{3L zj!3E#7`m6P&`OHMMV};Pd+kuKfLAW{2LBEp(caPQ)F(J6p%B$OFB*m0TM6RX6776K zMhV50*RA`FbLoATX7A;TOW0A+%vtuyey9I zpc5Q2|DqEL%)fspG)5#E@IGOAWZq>sZznF^Z}g0LeM-W_3=#iA*-U1vW`H?d_!c~v z_F(2!0M7BFjPrE_&t*Cun=!T1t7q97u-9%7uCK(06sVDjcD3PMu)@&#_+AQ+?M7Ag zC8iJ1Z=BEVRd1B}*ni;j-sA8ZxgLH-s2uAX?SQ1N?zYUvb14T!<}BwRBRnR;q45LC z0hY52Lqjg*5~a%Zaq2fTa)rol<}<5{UyI;BIKo9ti07f`IyguCtLvai{?~aRN}1>m z7@U-Lmmuk(df-&&d8(ZsMkGZSHev0(`ksz2*cNt}zlgfBKsMS4eBNO}<@uDCNR$sR zlOYuEzJAZC{NyQ|C2-*Tw|Ua3sxzVk=hI}5yUui;r5UHS|C3C>=K6z7I50K-oz6IA zcL2bW1rCRBPqS|fE{Lipc`hviP6QPlvi~O z7}EEAh!-`)dCTGk?hRW8y3EuuPTCikR}04qmd<@HqIGyY^Fa|N3O8P&j#I08kI|B% zu*)d$wTotDvC3yse`~`A8kNH4rZJdeES&$Oi#=ykibW~+e4}0y&)0g?X= z9;!Nd*3}-E-J>2v@@Axo-Y5sk_RdMTG zQ96@bcZ6!H53hY*IB0?`c54-T>TcH#Mc9TPoU4#1Rtx?x>ThX8+!0@RCPj6~Vd-I2 zbtE^Yf9!y&;|_+WpU(8;^PCs=ch8(_^}kERPUi4_G%ogmT}1J>Qk7q!`Y3r;<$1leTQRBK!(sbEuGWerFU?-+ zUP+E%yJ$U24w(v>YcB1FQZ_ylnvrr`x;J+l$}rjwM%UQ3r+v?h+!cV1aee7j(xVTe zOy%`@^!&#DsITS9A(iY_0E+W1z>ek*kswavll=D)({$t!)2GNIri*frSC3flZ}(Q4 zQ}0q(8hk{aG5wlbwd}E@kUBASbtSL3_~z&conFwF+7&PB54rW6)e;(N$BL$j{RcN5 zv9NgrDBvU-oxfpJgSMMBtDJPVA-$GB;gXZUhgGLl8cO;vN>6IO2^+`N9x9q%rBxZA z#n`b<_FQFaix64eqI_huNH&IX>-t0knHVR1=8~rf8RmCuDCQUjw8OxaOzd&Bs#D*L zXOO?@7Jl&gl%m{x_PE;$R#ctEdJncA=iJH4t@eIE9%b12gAS4J`0j1e{(JjYoJ~#x zo!Afa9$M{%Xe9o^e0c~OI)^~wcXfp+hDDv0dxsOa}@ePO!tsbJxJO>JOq$GhM@B zHQ?Doer!_t`Ho))Z$c=~*>F>-@P_uh#mi+4?Kv;q`f%H$m$iz_*QVb5;rSdUb90}EypHvca`GB{79UT%vO}b;;Zei7dD2k-94CJj3ty~aG!wXp`{HwJ#3e`i zXE-D{4DNm04O(tiAPJ|weD@i%vPDl~-Q3hIJKIsUPO{J}p~)W^L?$#A?I?AXK}7ty(d-W&hp7+HE6V|5e1r7H zuNT+x7A9+THk;j@aak%O9nua9Tv!5rpnsZi5k$dv6lORa*m?h*?{PxTR?u@YTet7m z-9dk!_gU15khyd%^FY%H_iQ;6h1_OQyHirVzmH*zxmc+GfrovazeLsG(>#B%_nl8` zbjx9_WyN|EU%OZIzLyg?_9RhVQ!n;6AZHmacU4od?GzY%KOd3M72%8U7*mM|)N+pS z)z~h2^eBK+L&?@QJwovOggdP<^P5?$)k+?xQWo2%W?$H3&RzW?QprG>*2k~ta(AhT zQ>kn^a_J0tu)7cU7U|Pw>r3To1LmC0x->G6h!Godb_40Kp|pU~kaJY(!lP)I0fAZ6 zd&dUVIAas8_tI^2q!tkUKA-4bM7~YNq;SxCX&l$4ntMO?=Fr`5_-baHA1S&6Xuk=? zi(-$#u&%(IRy&V^65rumzXb3L7(y z`BI?SROc6)Bc?Y&E8S<&WLv>u}Om_;r<;S1^cE5>YCa#6bhhHpP5-!<96Tca_) zhPz%k#|?mHi=pnU##vbKoiB@TkkmBdeT)Jw-(lizaY zy}-V0@Bo_wM~MPkxaIC`8?&*X<8Lk!Uv2%5(bK+#8T~0Vz#sX?VhNaRf0Y@$3IW6b z^Wu`&;4uF>hK~&Hwzjh&via;vjjl+*0BIj;A!WxJM0;3?z@d52ZjHy!ikl}9$`QnI z7A=vMbPEMNC@lC>?FoN2|ndvd$2Xqeo^}b5HNF7h9H~-C{ArIVYo``b*n&YO}G*ui~DH4ISi#eO`Ub zAzx#U9+M$7Rxh0z)Wb)Ooh-we z=_x2b;A4dH@&n&D{67UOf$0h9BmIR={vQFBV5Ebmox8O)0GWVJYk&X%iUjRnRR}Py zK-3Yq1Hg<(ToDRDrvRh~qBT6W0Js7;IzUVjcmfG1LV+(n0*E3M02=@-5elAe`3*_@ zZvc`1#B=+bOZ9hk{6}!80mT1_zeo%b07sB`<3AW83Qqsu!4QE%7r$YM$bN#`i~k2O zL_i=2K!U(`G?AbXfcM}FfruO+Xc&N{z=0FZU;^scj zg9Fck=>+3Bff4xtECLBY@e6|B5C~ZTKn~!R;s>A!J`mLd6X3*$3Gv|r2|&gGln+Gz zfB`K400RObMgyXR;1w(f0XRV41c3L!I)#8kAp8I@4&DQ=$j-n4ND}A}NC?;|F!2MR zF8D4V(D9QxfM)zGe{l&}0|EeK z`2jo&sSAJ}{HK2KDexvznSh#*eEq2$VC}$fd}P_r3WWHDAiol*2q!uK>cM{}B(DLz z0CnQGgwzL6Cw?WQuKd&qP#=CzAYI7cKpmiN{7Oi@;01tO5Vi(&07U}fYNTEO6Bz$* z>6cFYx&pcY(1*ZBUcvlt2`TH7a`2h|Bv3Cvse;%4V28-5`;qA3)bna=ze^6!6|Zvp67)Ka7gg1Wthm z#(;ov;2`h|l6}EDKf`mvxTq$@7Q?q3&8*Tx87eYZS_r^`_6Lcu)7bc8TD7sj0j-`I z{6{UelD$YIHd+fAXm%W5jmVq(UM8fa!_9Cz!H4$`SP778zIF*xoWrRcA6PD+j6HABrm4RiO&Y5+}KKdcU zuwE+RXJOywCf~IpbbVaEvH zh@N#3gSl7zN;7kfVfeZ1ZTR-M{qA~j!i)>a7>V!=0~_&VN)bv8)t7GGbBG|RFni-G zJ&U}9qF>I`il2A0%r|&$q44#CR%6r1W+waVE5jc+Yt+Vuls;tJE-M;)o2on6Upwzb zE`S|NbV;NE_uJ)pPHy2?=iOAc(1cuiL9RQO!n1VxiAqSA18f+5UT3JS3raN~e#k#HB_nVKeOU4^p(V6WPiUdB;428Gb&UOb4mLXhSC?c=pl`w4Msba*p zylhSJAtXT0H;=XC{;2y%H$UNdN>s*3qove)`$WF8PUC@_&tfFn_@;8!X_hZa0Jm=o46r%{#o8ov4H#7h?sECkcO$@;k`F@l}9WiJoQl1OTpwq zcqlrHVncMA=m1qPXb(aNei=*@WBOQJ^WkM6YUnIUR#~PeOS)MozO$P?VWkM|v;86M zB>yiLN#is;KFI3Bna1zsUOVXFZBNqgsxA?@`rN&Va*{Rz1!2G2oIjD~)1dfpJ^>)` zzO)jJG%sZS&(BBos`TFBlu@>N6kbOEz}mt*+CO{pz)k8h;?c-3{_=m!E4j!TW8zKuA-~+!12FTPVkNLl3YHeo^xC8DOFn8ccJ}N9^BPF;@Fi2qD zI%if{7A`b|g+;)KTg{JaV{~$odiWtNAquik_|(D8*I&shNBmT^kVhy~?YA&m;7euA zk!ST#G>yV?dGi~k z7gT6nm@h>Hcsh7!syMSg3%c^6u<`Q3{EM(!?3A35#`5C3wrjU-#Jj{am!x7vENjYrCqE+0y0-92@!G-UJTcP@3ffWF6M)tDSGMs*Ns+ejV#)2P&EU~XP38|S zO0P9@Db~&S29;!wzO|*~ldFDH=~VlsH02V{bMBL*3DL8qplZc zCe@iH?p#iWp?q_&!y5TS}tvY~8FVJ;*aV04?3%@R>nN++%yoR|(? zz}#7+TmF+mVB7E~pH73Y{eEO$Fj(W~M})X!Z!N>|mT#83=BkkGlH3aXk+wY`s8QLa zLTxu*(KO0f8?!_${Gz9tEVBy#SP?&~#J7mxn0_Rc&Ws_*~fWl5H7Ly>JH zi7c})j4k`VH6tO)zC@B4k`_yZkL+YgSxdGgr6MJ=%aRsC31vyNTOz-6hd!Uo%=eea z@A3Qo9=|`n{&BpWd(U#uJ@5BD_q?7Lfj`-Zlgjh6d#~pE#-<)U!W45L!g%uLpiRYH z{k&^%&20uEeLk8gW;ewZMFWqd@2?cUEoPkV)!kC_v^B0=B!b6agP~dXEw|d74+XOq zKc9P`7`5eeBkL|=|C??+$2XNaSJeo^>W1%Ed(*TkWt!r;(+wGoWqZ4(o4DGQ<@4g& z;p#Y>5D(|L#qpF}T22J=wxpWhzMKqE_r@m=(Ed+Gh3sw2qp4jsD$WrJ9ytEkK`L>E zoON($8aM8q$M9Q5(9?@W8QWs6i1Z;7)}*Q9-+rxeRXwED4@Z%+fR zeL-+cf1!P?uIfas&D5MPc#2$0n|+qIo`E(8zEQqt@Hl7mXJ6A)nrB>L&1cwAewgc* zh>dyt3&r*>X0UtVVVckEl(OE}xT={9Zj~mAd*{9kC7iricY!JWPI#l|ExqE4Z;gZ} z5Igie$Ml;p?QTB98U2OkuZ%pZ8-_0U9II;aczM3II=p3GVb~m7B;gT{;Yic=;CrYe zIC;Gzx5%y4N}^x>oUtt|x$Z*!l+{4<&|lM~Hp9;(AGXcZ-G0C?v~OTcqOqqXv=^87 z?g{o=>G`gC@0(F!0dI6q?9lV}ErRd8{Ji}oK41~wzkPm@_Upra0@Qkpin~RxWfd6^ z>YiLs-}wFkevIal)vMOy8O?dKge44(61U`a4Vou33~7KOT)5}jtf(pCfY`@l)ymPC zja6~hAIzTC+oGp&)Q-3FRxTW;#1`xL`jA0O2NKxnlBU5 z<{LjVG@nr(Hnepq^YxFpFAF~z&h0!cLhbTU2|J*b-ak!p;B~(_!mLw{ zhSTNj>S%~Wz9mJtj(Gg`^Xua-0tB$0V-?q#D|(=??(9FhNnH*4SP=Wi8^hsO@i05ut#fgm%r zN>T|j*5co#qE`T~rVlp`GMs}JRGINithfVeP7c89Xm3T^9Ob}{ow1SRz)fy$eJSVF zygVT7&-!A=rTyPIcHisF5_~Er%_vXYm+LGp>sa_B@X3jVbEv^XMz)@~q7y>O2ekH% zjWOs(ZgWi8GCHizIBrnuW*w1yOq*7H>hsN(6O2=RuMRccU?@m!R}y!*{+@`F_SzCI zEZFq<>c_XaJsuhzv)6Zte%0$ai}n$la%*_^v}q{r=c{Ns!`N$X!e0hUy`D?k-NP$t zJ2;v66uU?5HJLHv>b_f7&BFf8Q(IG<5dUoAU|l=2$9d1q{s!~(g1G}b3|TFD`>vWy zmnipk&Inn*APP|#6kws5BD`7~(+N9Z?5I5gHlfEH!wdPLb35mg&}dt6>d-WPsTE`ockbH4ip=hr6^ zbS?Ob*W)3ahl}P3KZ=~)*;Tra-qAhd5w_XuoGq*@+P5ShZ>fvOW(g%$)Fr_Q=6PS6V^HtL z6yn7$MNNP2lk~%$W_{0$i5BJ_os|d(!JNsNJU&gc)kl5Fc3@UeyJ}1n&d0d<`5uR< zjJKf&tNnJ&-2OuxQHn=WoO=#Q=8G*mS;Ub$aXC}4mKHQSTr0CK`C&_tt&G}#lLvT4c{o#C19@LNEzL3f47c7!ZOl=LPu7Z# zV+BuPwA07KGS%Q}{v8&3+4`0a8us{gGYd8cdG6T|EE?_~&iAIPkL|TKhTC;G;~-Al zK{(aaI}?HDj}|}n*)cZgXwo6cd^i6ChV906TcnP`q%N1FB=R1)GnF!K=-;Hw5s|g=S zyqB+;ra8LwqJ=dKOUGE6ka#}Tl9>$|DkP~g8!t?dU*tQT(_3W~kkH(_lsdW%zh@SA zRoX$rHiEAso8zNOwg;zGS@kEZWNsSaK7&x$9XfROddE}xsJmC+B@4q$SbX_et@CQ{ zcsSGzS$8Q%_j~=&adF-ex+&I4qn{@z0{))n29IPC1_MJ`wbOps)V#^s?nMFLo~V#r zNzWOHITnu!+Xll=?h)_0*?%H;WXzur9-Imi`CBR8tGm{H78#0^S%aZEF)` z%0nqh?Yed1(S%p<{O(rV$HnHES%Ffv#|t(Y)Vh#LV78{kaqYOqqx3*RKV3*8iU?u? z8&JQnz5iKy_G`C#0FMTXgK6z`n|(*^iFL8JrO=xKb$O>6=24FKo>{`h#%NjMBa6E7 z=Kbn<3%`y$bGVk9VDjRoyF3IBz1$aS%VE%`DxHrq8K|bUep8)Xr1Ellx1gWmXKJmb65*|B zuP+7-LUcr#IRb^*gTTnrIKZILIBVmt&r_%s`rOHCA=rUm(E1TQ4=ajV@zkQXHAKP7 zTaB;FuID_v89X6SXZ+n1%?OEb4N8R{X0L~D)+$Sj!s+*0{B|izo}N}Ji7(h`Yg%NM$>iPn`zr6h@X1bb(0via;Zcts5PO2;thT1;9&3H zr4W&zg%RaM1Jc&NP3=w?zm%(sp#~yXI3-89jw4Pre}3##F@6kdp~OeerM)w?dGA;m zO_37*QJP2Oq4$o?BX_!$Hh>LYw%7uvqq*mN&3z@jjXILQ?HQQ!vr0Op%j(S3Q%vCZ z9BesU$0Me5p=646CK7HchS#nC0Pp?CAQpWpxoDU7MbGbDh6q-=MGaW$NR--pmf(iR z_^%_Jlf%P)qrrAZj&tbsvNy#G^9Y3aU9?#mdOlaZbo8uJ$1UD#{L@3nQlI-Z9&||H zCajO@7r#2{A#MGFSMgOMx^d*nkF)J}?NX}F7){UZe%>+vlU7jNWu97JQVG70U?;~P zLvQUkC9vcSebpdaWLAV$wZjD^j+r^)*yD+->9&NQse*+(eHjut77|_|IzT^NLg(tZ zFplw(;~b&$4YJ4^c1yyYx?As_O#I}AA?#uV-1U5dN!1vE#lyWL7n}6mt_7}f6y=!u>y}OlRE932pRW53b{dg?I zUJ_zb{7$X0)Z!+v4dRcDRRLeP0N}S)+pw*cH31(ATsKD;gj`yJ zxQuU-s1^Q|;>q)mSdX85>*a(ImC%++ye>M^$@J+|c$;4)>1)L zw-5Z}yD3u1qV_%V(b05VO!fm0@5>!BZ^iIE)cS~8s05V6f9xY;v*yE~kA75vy(u+g z{8fSsQOY+K12tP16H?p^hq6B>?LXh*JR^cV(p>pgGu$NE{j*o+!7Au;P}U++BkZGJ=H!U1_|9Y-xd0y~8M*l`d!4zTh17MV`NgM2M3UL|fDi~AW^ zGL|Jk$C2gVbR*Z1o)sCa-^d@Ca#LtQHd;;cOu9&bMm<8(8>1f_u5O{fPj!EWk?<}0 z1A4R1xs>^gokA`(MhM(VY#yVX#_ccj9b5l_eeZVLau&z3l1gtCdpf!L0Pc6+5RQ=v zx|w?o->;{K%|4k6obsCXJGI`f9zojoLjI*XD&f+a$ar$Gt(zeB-*kEiGhJuL4Q126 zvcic=9-{g5EUAcQvEI#R;?HRs*-fT8z7aJyEm7Sra=TI=%U)n6PiN|8#DdPqGaO7F zzwfMk+aFuTB|(}2}sSuNhj%4c))C%H}LWeAB^lX;z4n~L!RCQU6N+^Jih`n>MA zQ{s~i6Rg-(m!ni4;d6o4eRgp=Pu})mzdwQsK?z9`>yx>&wY>b+{bth^r-~70LJHY{<@h~P5eOkQ^rvol*J@S&-`ND*F zyK4<|2V3@T0V`@li%Q^u1bYMjIJC6mu$&Ufz>c4eo>n5031*2B#S0%rM+5lTDtjPx86^eP0;h@@MV+oq4Z(vN8>Z43CMXK^Lt8X&gst{(C)WJe)y?{DC7Ok zctm>#eQ0D^LGS$zyR&UGOfM7~zicuJk$+FErBq^^H3^!eaE3nhaF#xRM$baeGK?sX z6D$3zmnsqxwRKqSz9QeAjWjrE1;3s9{L$^`Hnp%frA2LF9eRCDiV>}GQFjug!cJ&O zjvyZ?KMI*H`XnegSKQWI&d?y!dyH+@_ot!G!o&5vFYUP-?YT!RulouYp76)#)fH^- z8ID>~ttUv)NuAW`3_3WHSsF7qm$2D=CiQUlwSenJN`-mX`8`bu2X-8^cD<`8*ul{J z{aOzPu8M{$>@K>U@MdSXh0e{*ei*$&Hl9af7A*}r!@O*C{t}7HEZC-;mDk+1gJ~!> zL(oTp)_x-W8qtSu@6r5AZE_c)Zlm>G-rv_NORGI@t5E)v1@Cg}xLAiz&eMX8(`BxP zot@Zgm#WWOxe+hG?&XDcG?GMv`!nK;^oSP!(kN#@a$(ut>B{!`4uQ-Vxr{--H8Lv(~1kbZu>KLm1zATk)LP z3ziSn+*5OwcUt@iWu>-8MdD+S6~7k0{bwBsgx{lB8bAW}#-H!4SVLAi5|b|{rlwJ7 z^eAdP3;-g6E_%TnRl>}J?}lWnNPEy;>cechaU&UjIp%s)E+Y2(O;2vYgE{-R`6lc>^GHIbf4EYh z>GJccS2N6rxc<4HSle=v2zw*tP;O=+*TTht-R@SP$j1(a&lw+_4WIL7AqTr4QuBP zbdRJ+h&sZ;KL<~2!(o)N*>9Oo`V{_EK5$`|H}}_tF3qVA^tVKmUsG#0m9S+^0~CPw z!0VG+=F1=EkfXb?>9G+vj^*y)q|HXy_RKnZJfnSZ9)VqgKzcSK>|u+gRy-Bw4>oZA zvDa6WEf}!Fz(t2l+Tz3nEv;Lni(6W* z=QOoCQE}L70`M8%U~EA@oru&FML&CAGXB@$8eP`G>-%mP$notC$059M6SycJdXb*PXSyFkJsc)ZAm1 zt6oi1zA4R)^>W@50p0ij2L&D*!JB+BLuF(0990P}r#c%nDn;FbP88$a`>z~@Jrab(VwmgAwXVZfd9rzbZ`9=3y=jf}CN&lp57T5s(&y4bqKH_&_ zi-z%@J*$K*&`9|kzXxz~zv1_miwC#$MhR{52yk_4gP)PJpdn)aXaqq<_zdV;4Kir)1Kf4XhBReF7AFn z{{sHIipB@34?&q^^dK6ai~(H6a0SRGV?qw(lQ9Q61-Og_z~ez;85t|;SUxOx-uV|; zzF!iWUraux|BlHA`WH;S6ihyV+#=hI{rl|o&ye};qW{L^qqxGL{r>{TNA+6&J8*nx zK$H4yZ}~4L^dBh=Ns93wDa{|HG+_Mv?@DQaaO8h3r6HjNNs_Anp_B%23jg0oY2^NU zDGgAD{GUo`NY3L|EUyl*n9JTw1JW;&{F){lL@A_{7M!&E9tUCgfHs9>(*Xp?<&-WQ z&;x;u2~7OJm686d56$PKlmWN`%PB)h6G^fe0rA;#$`}wYms2K?S_(=s0bbwoUPDlx z8Az@ag9FLs`5FLSAwVnXROTSLQj!Hot~kVfAh}YOB}lFmW(ATfrP)IS&E+#dUIEf! zl1}FclBDt=f*+EEGX=E(0?uEoUBHF~TOLVyNlp*|E0_Or1OHela>Zpra!yivfa3EP z(9n~_i6tF=7$jFp^#aM2Rsu2TGSrZSnO&wDuF(8Qgk0n=yy0)8T;wm%A&Ky}ynX^| z4Vv|fb_lek%eX_P6(^2dW*!2cPEs%-e}NB)Kuxz&6JL;A>F?jLe@LLvA~g!=RDoiJ z)DnM?Ti|*Q%27IytZ?}RI^aHEw*GU}@19s4`MX5dpQwu` z+7p5JSW_J^iJ=g|p7dK)^*7%Koo{7y|8r8PmJ~*)z7$5N6Mo$s%YlHIy(?fB%NV*l zK?LKK0Rw@MkN5(PF7mgckw8)R&(WGb&h`+kc_$z$dwU)sdb&Hn{m{}#gtUSr+>J=| zCMe3t9RB-%(now;#hGLPTG`3h(RuY@-cBxX2YW{kXD_E;rvO<=1-Kva(t`SArU1I1F<5Gd%zUwteZ31Ku> z*^sg*=$<08A)tGIwGDh8g5p>?6f}*l%7*}gPKvsqA&n||J_tffW&`?D@_qq3dnnH* zD+^&mSC=a*2hH4MHY~;UK*%AW@wqym91;>ilG&gIKXMxenzP9ADL~xM)iwd((JxEp8^UjN{}6^AV)dY6civI zxcXQu0`gvDHZ;Y(fB;Hy%6u3q`Q)J|$yMhA{*U4s0hdS7Hi&;fKGuL^qs#~V3uQi8 zAg3WK7p!Je)CH*kP4uhl2W}aPd?+-8ttHDR2LWlxYzmOSBC}y3sO@SS8Ue)>WH!oj zQQ(#)ZwCrI?~#`QW&T)D5zLnKLr#NWs%t^=Vv4s&lGi0fH41L$6_I6J-JODf&#BD1FJyc z1AvR6nA^d(N1(7})v+jq927UMvY}-mWjDDE4Fv+L^Pv?e`~n(~w#oB>5l&eyN`a!E z!InnKd}ui+>{(SV_?|CC|DgcDfFhqP^brNJV`U+V0=Z2NEUA&(Fp#1h`uW?}qGS~) zt_KPL0?6kQXy{U$PY%MHk)2OYmf~8XzbQ{t^snStY1O$}cqaqy=YN#O|3n*1VM5)qyB2prRq9Pz5 zy%R#{EkL9O2uXe`>N)ql_doc)@n#Il2x0BDpY^P#%sJP7aYtW^Wnp|9(Oe=H)2$#mcuGjB>=|mW4MHljM2E*B<|DbvV-=CMNCc zSB?FX7x8{?t?joPSAufgNUBM0v3f)6qbGGv1jc2YV%{5mRM>DjY_FlQhR9>#qj|zM zcP*W?&pcZn4f-gD9U3L}5(F;4Y-?Xk9ZDI(->yv~wj6l3mP%IXX}=98krmHEF*pe@ zf&cZ==)AN62iN~UFZwxQ*8lSs)78-TSN`vN`?(%Z{NMNfdU!Z|-~V}2GyMN5IEdHS{JtqBrqA@olz_@~Pb6~2EK%Os?NQ;L zy~I%5@<3xTR*BDVZNrTCh6gqOcla>wgMDtTA7x)?At_=*XW;5 z=8kLn;@E04hk7o;>+UdVbXtD!aONaiHewPx2B3}TyE=%8 z8yyZOzn_Se*6fJ5{^trzf7>rC?fre-;QJWd#u$l~6~rH)bnooSq=Z%ZRl?S2%J3(5 zv(GUHn66X?(;9{)Azc|@1zqK#Yu-2hu~YD~Kb6K6XIVMSoRB)wm=wHj5yOwkD6mw( zQ=h9rN>Exc%uI7QHSSE{`39$f0y;kxW~dKMPOio#VTlet5`m1IppUHTV*w- z_b?F&bnlLYa+O?DcRaCouy-*RlzP8kPOaUz1HXSgy^`c;+nX4L3oJhv){~Fr559(0 zr+*4P*1p(h$;4DctZ!(Y_NuIs)6y-}i+F?Tp5{zh$}Q+j&`XjvRen>4!`c5C^vbip zPW_%CV$=~*)P>{_Uk#{R?|qn}j*MY~>`%Kf%-Js3c zIRRL0u7EX#C2e6Z)BZ3#g>C*GG#xSn7A=9X*&vqTkePpqyRp-Miof>;6bcV1uOO%W z=hn{;cWQo4JxNpvDNA2DA8J1zRky3}r@Hm@O_8-@HA;cd7Y|W3Tk$LjCC{V6TU@3& zPUTnB+Omcjb=`f%E&D$0;KMAj3m3AQ7MS!o;8>f=0}Y}gVrE9(5k;nju_q_xd^~b8 zGS3BJRg%APh+EHKDo3Cg4spvI0o$N#!p#0ZHrM#4fZh23>{ml0#2kxJcy~K+H66f1C-u+kf}PK)UJ3|4?xI_+w5$LHJR+pKT? zZAmgV1J)sb-#o_0`@zyp9xLyL&o5RcyO932W{-cJpYhSMO+w=15Z_T|W)rY0rMnGf z#a-WDjOI)4an!&2AlUk|i*!MDxrweBYM{k8seD!grPI>VH!@O?mv>mOvB~)!JtR>T z5j)eKS`fM3J1RNwYaT0(VrB3idA=4gV?CRi+h@;)M?~G-?EI1_=lfVvh2kS9_>CtK zU6rFB@us-HWh2RV>`rZgiwp3~_`d`Sr@r~?aVzNLw!JOlifd|illN0gPph<+l(4AO z2V1JLA2yn9npBKGT{@HnLZ$Eg_L@&~r8heV$3lV~rmR9#i2i|fnq5?g^Z^*f*w!Mu zJ6u#wRyMz{zu=8&QnG$R5l6(ib6;Qo#aY-pb)BVs5&f#ZR5oBln1w5OJ|S>#^R!of zUe;~JfM-f?&GPCSvPIE(h88#CMO|I`hKCFKO_SW6Epvi57|;H&zNW#oI%+)1$Jqr|H#jX4ey7A5+H@@`!; z&23SU!aGLZrC?qUGBVenkb-R#g4b3h+py3z_|k~C28C^!c+29MsAysDNOys0lFj{e z(;H92M9!+|Gp_5T5!x8divrgbR+Up^BKMZo9P--O4lf_BMqpNtMwnO)oY4mR$Heej z^~ETje8nCi=b6_>mQbb-dP60YH)?PKmt^5htA04VskwCqaF2>gcN_{4I^zn3_JQqs zEGp#+JWE=pnDV4TV)yfe!2|W^6U^9$4hmiHzqzFJ)g85|ue`V|SCa zC$f1`_WqBiRyP78J?*a*{u(CmENg-$i~OpNeV#VbKJ4z zK7D~pgWek3VG(q4rAu6&%iYbFUjm5@Lcyt}`S}*E3a|l?AIr%MzZEsUYf}7%BZ*f( z?CUj6c21tm$n{fErxUZaCvPs4am87Pq=nAvuld?;{wiOtx^iZEZxQECCHo8rZL)yvr&{X{XG?R{al zzFJx*FAE=e*ZB1vqk4g;wabVtey`4ZQD~uabM?L(Osl^xUm??R7I+p^YIs{mcsllG7(7W4?1Q zW?2q+$IK_Z$)G^#e}kmZHmfA)Jl<_GT;`6cSRBmUSGnz_WMh*!>;L57dQbLj$AC#f zYVN1Ow~`$_@dJ8Vn|v)jAf1)_&f2=;$_-ee^lVT|Z}Yw%Zc}#;t))gq$T@uLgmw&^ ze1jf+pSHW(rQc-n!=Ywl;$+ooBlh7`(@!*g8^f)xgl+Zp-9Xs+U0GjEA}t8OYS25O zH_?>fSsqRz9ZIcD-^N2RZ%~$KKegT4VB5vl-(U1zysmkWrPIRkNAG}?(rjyBP33q6 zrQDN<(x0u~-YN8LWf=(GCQB%9&XJ|6Gr&*yhXJ#X2R8>&6l`)Uy@S;qeH4Ugqe8mt zpC>!EgRLRdg;6U;hGL08#N@m}NmU3V8c#gPK#bPOFjj>; z*|7lzcs;54h~m!?aYN=}OpirwM=v{PN}t2o*HRs>e#u+IsySi=nW(vipccYfF`-)1 zV0V6+RUK!SalU*fb3i$T^0qvwf(RiYGNxVSxoN40$Uc{@Au!Aaq55`asX=tdZwST1 zB{TCk8ZuW3(vX9{n~!l8w5B8oUn3*ay^#X5E~=~RQcIrJ4zMo=o2~#Pf%$$<1F&y0 zaU|r3M?$9^NL-_IpBM5_T=pz7s4nyvcW1FuxNgfjZkC1;6dUjjV9o8xW_ML^ZWz`} zR1Qj9#!iN*_Me3l(p9L7=|b^VcOLo+fZ7CEL(Z=?FV&V(rM^7K^8N6=XMdLhKftei zCV}g@I|YLZ1&N%rDK^JibkH|x@Q$u!YM9TgRN9@cSBCX+ygwv5+sL~*m=TXCXf4ep zZGjZ-r?zvujz(oxJ5&o2Kao*`5$K!jRn)M7o!}9O%1+UEgW%0+0ksVE-3{NG-F|G< zMk9=vlmM9<*v2j}EQnDzpBhl)NlGlg*Cj1sUUr-Fhq@kl6mFOH>F8B0wM$z=vhLmH zQK#QzgMfSb=vgGFaxsr5Z0?%g&9*|Ws%7HswLcdeXPpQ`p-Af!DTw+Che3-3fQ>BxTtYAjooeJ7xG4?C=F^x&7 zdP}JP{$0StxX)>J^hszF#AlOBN9Q?d)3c7c5n(UQMSs=2D#wFG{FpT9>s@G64w?79 zwd;RFOUIX5Ox`~AQmrdj*L0ppz-C&VT{$UXRRhXs8lPp&{X)Xbx&v^Hs?8>K(yEg$>azIO%} z)5IA`tuw{YBLjAcMhG2amK!Ty`G97J$`(jK|tX+saK5MFV-I@j} zgkmsD7Z>72=gAHt>QLGpUDsJN)ZF(kBYSNECQ|}w)0is06Ivwyt!8y18ciR7on0BA zucf25yYTp7$y2EX>*$i>NR0nR5> zQ0&?nL!w5h*?epvc?*<2Sh`4EuWbU=U3YOnL4oVW&p70!k%{W$J=?#bSkh&XuB%kE z?%#c}9JJ0Bn+2*2UmPN)Fh40~ni;)orA+3X>*pO6x2pL9+e1NFZn z9YD{kh1jk*r(s5^*BRTq&KI`+ZQNsyGe#|QBgt?7 zLmHzmV^#Ns*n(}VMt@KYnOQhupi1w6hE)kVPgk|=IIVD2t$iVEXP+$WneRr2(63=; zSEicPYZeCvh$M)=Nf>reRrMvKg^1xmcH!*TpNAK?;OT<`f!Jd)jM(lp5-BnXYMk@) zb9=M2?X=3P~c( ztJ7u1B0h5`C2sUUR2C8%Qh(cy_D?gvu1#T`;^t3Mb^DMGExsAWtM=`uIfzp=hoT0C zkB7zF7d}s;ez{i9$BJ1VDI}w=1tp((<7%9-5+tlsaU{D2rk7B^vbu|W@k~Y~1 zguo6dZ{M8SbZtgEX&FHM1>ihVYy7um_cydQ1etjA<<$^X_~sqLjF_6?MZMFL1Kbzp zceiuryVqu&r#^@`G~6uwSY>JjU+l6XAqv`XemsfwAWUr6)n(VIQwmv|hWJ=je{OlJ z9qqJl0GZOczJgw7WlGsJ#Al%@Y& zX|9N>poK(n@$g5xm^UbE*j6oS=#w7Utvb5eN1ak=4)RH|-Yxs}(q#3IcwQDQEb7m& zFD$3uNlmA!s)jYcIHH84@!GR2N+q-w7h^XUdIZI$ec-fU-EOP^5>=8JO7T(%j(roG z<%+Jpp4*X}0iXoM6n%D%ggMZ$t`8&C4NJd3i9~t#fzV=QIroV{$-evLmaq6~W zdLKo~-q5-!sw8?1fB`P<<@u=eyI~Q!4Cm$h-mIVk)3$iVAq6R9?og}I^*x|tsT0fi zx6Uu`KNO+tm0eK3xwsCqg~V2=In!cQ`;}F>oh=!HGCnuw0_deyT3Qy- zPq)m${g@2&7rvi+E)T`$PsFY=851oYUjHRC2P0cmc0xF z?_^`-;d}9j`a{eK3lVaSoF2Rs-b7jGYk(TqW0YUVAs|>yY4K$M5g=B+4LSr75{$;d{RxFQN?}TuUw}i1NkCbHk-uT*~M2!;m=v-Z_{`RaH3) z&Sc-J_B3Rot^c$;b)E#{PK4F>r?fPSEna%KK_aV}*xoNm-#+)Xe!WxxX<%mU^X7*G zfO`}|D8yj^7W%2LKV4-LnD;@&s9U8I(Y3dM5xJBqq$VC@Zgh4)yM`QpxOh{_&1^Gl zg4h-q)gHS^UrPI=S7jNfBvTY%(FE6HughLvP^ErXWpoK`OJMAyzk0N;2ZSdn3CYT4 zY$LvLU<$R>OJH?789j;wo$RW}8?8TkiMwne9JI^xFAk9mZ~GW0;j|pt`sNnXt0pGA z*MF5Mv|TcV8F4f`_o9%?sB@1Ty^IUM#YIINcIts`62jf3QndvgW<@fxw4Q9SAXQ-~ zaOt3h>uu_YP|z=#0QfR=(-kxeI05lqZzQ7r4w(XK`N9g6t!K$)e9@pf2=O46|6s)O zllYkmfLG}8DtC9DOFIzpX2btZ@V5_Ns@8Y66BX^N4dYI-b8)4E`eb^B(|NMOX2)$i z5l*?ntFOAf+p5OrI5&11A2^Be?Vp!X;`O_P(O`6Z!0S0WUoI;;ws%XBs<8xcU<#J6?jDJ#xIAO5QF(C#njLDmP*y29Fg9zut`(c;DUGVZbXNo5vbwI zr77gKkl-+*R#WtV6)7acH_6YRR|eGDO8{0+!pXZ0<&*HVN9juXBb@Fxv~DZ_L>t7d z9G9P;Gb40CQ5+uq+IgV%iDd~qPCnoYruJtG4pcaod?G>NO^|n^2J359L>2|+?!u^T)m_N|_mAjzCNbj5no%*W~^jW3u zt>p5J#%sbiRXt|)maa`SF&{b7r8i)cjfVJCj4KjsKYS<3gCuGO3cBRS|;E!u{i1#+PtYd#TlrGm!AGA0Y`Zh)pbm!EaSxm)|7XmV_3s2!+Zos^lSqnjm z%(dgW{IHXW;r5IP@^nh+Ap<57jHj#LvzVf0~=p#Pqk;lE_C zGZX{sTyLIyotP|K*pLB$40|@dtX3%%T?2Kmm{Izy!0_Tg=U^7-Q-5?`J~8oNV__|3 zeo@AuoEfYkV)aVR?jo9^i z=TJWgX^Ls@pXN-pcUp8;ukQD_71FmOVT*AesnWfn73phH3mRPKiNRMX(YiX%2c!T_ zldi}qVYNj5`BRTjqceH4oJ~gn);P5ix(^cO#Z0|x8nd+=pCBAGcLt4%1JJeC;Jx}a z3vmv-UlW`1u}xe5HzlkC`nkH52f|ncaCul*r7b8jQlfce&u9+V2E0gDOC4OSu!xIrT~BYiFkS04=1RO{5#se3| z(*msBx?ch0SQSoRs#=X*3m!(xXuntq+)5P)VtJ;c#c6MYnBNYa@vgLPk(hhufrJIn zdjvxpvo8qFby1))>xnxw47xLKTvLf})~iw}*x>bR698E*g)YQpVWBkCc!Oe7(&d0^ z8f8@`JbIN*)-UsGED{OaR3M8gUasva5lT%16-Xw)b~T#<#!yMeY*K_>?;)vE6oRa* zqd`jj?{pgZ4==dGi^VVZY7M4 zSq4Cla&_?StNLRd0Uv|)eQxNvrvS{*)oIb!4C3NnVsq=(b5!N-UQN4%;^8r0+^kqX zS4iD$esxR2c4)}~%~W#WvPj{E9Tmm0tDbHT^cJ!lhM?{Ak-!3az-lVY_?d+s75K}-29E-v)Pph63_ACM8 z0TS)Nr`XRC5mA?AQAHVK67gdn&#^jaN0M^xV2esZDO_0^HEUlk53{f2*Mh;&Ecj?$30rceRfnxE8LXMug_vX9(5nCu%fU|i=aQ;60~P8K;;`DG&9}c zSYB2~jFR14C)*7sow*6oyt7L&7Ro+S= zsovX7Ml6pE*{JECImX**kbh7;Nt~nfYkhsM+nCH}!fZwD!`nh<; zrh97Dp{IftTk6Of(dmlk&VThCj%-%z3mf$Wm@&4R(ZUqC5|2g61;&R>a;YLuZ>Rn? za_7l^aUUV+bJhSSW(}+F+zMrSdXXnFvctKI*~9VRX2;KfG(c$7$C%h4D9~Jcc9V~| zDAjhd>gFLv#T$(3dJdTHy0&;#pAX|Dn(43mB*`vF;% zUcymY62Vzjb)0+iVdtZeJ-DWXmZ*XPl$?s1M1#4)kM?iZd5BO%^~;xk-MA47D)IE5l5*dr z`a%=AMv~(h6Vrh$cN9o8^>)SmV>>J6*4B~fp6Lfke#Q?!6&=)^t`Qu*XSOwY;|4db zY22qEy(GWhj$tA|Aa#dDB={A2S4ageTuH5_tjq=iVZtS!_LZTcj7HEb2Z zo%U)bZS1bOlT*T`%&>b!etsq1y|`winEZBzgWh+Xi|f9Bk;sK_u%;zu&?~V-F<<7Q zl-#@ibfxJd=4seAt@g!Xh0*lNu6oMM+wN5z8$Tk#@`IBFIusN17`N1KgrtRqY~$27 z&2LppZ2@3ia}tzYiOAt&nsyzEc5iMUBs>uiw{~$#M#r9*z?LH;zIemkB9W+1;|6zM zHMYk!okpM@TfqD_)pN45`#Rl6w}?kC9WMIz^b>T6kN`pdzOS0UO;7*&?Y#lFqPQ*( zW?5PmPXD4(iHAPh5?RNR@1e#^0Qts!S-e%M$-UfHVz9g0Mf~}M?xU()uBAcOr+s0| z=$rd`gqS0S7H7A2Xp=qh8s?>JKBpxgtC4i?gsO6pD-MB7YG*yf4Pox4oNeeLD zy~~KOV^Msd)CLW^jEY~NOKf_fLlM4Ak3GImM}M`L8Sa>Y^*9i5Z`?J^BTnQK$ElPb zy!wu?1P`w@&#Q6uUCJ?jw~dX_yaR+l;>zzf!~T2N#Bw+(&D*j4noQ zww5FmnY;SCIozyzV7GhHDIeyv8Xy4Rw8k6`O*>UzxZH3Q?_41_Hv&nJmFL1Wxyc8x zgx7(#DEIa)6(RH;sp?fYETL=G9YQkK`g{Fqzgb4o>x{Iy)qeojBT@T~uYPh~JSA-=;5`N0xPE^K|Swu{A)F?1pF%{>1H&tCr(UUJ~h=Wq-(k*C~OPpc~Bzo@T+^bY(WLq-agKFkVY1f z6j2z7!WrFBIjhz`YquAn4KP+l8PY*Hef%Wf^oq^_&~T!uvzqlyJgN`!7*!@iwLqhG zpJG3LHX<8ACi6hCgrD)J0=vavQZL&a6lCQkS(Fgmfe5XjLZ(CP1Cl}#iEusaWs6Rax{@1M|o(MAb+HK=3 zunQMA+v}Lw6JBp<+(0C}YMS#0DCM4l3m+yZKZ)$WT~XoRn>Pd@urcys*ae4HK0GR0O+%C;cWyAT(^1nvCO5Mf>{F>OC?x%ye@7(N!~ zh)Dmit)51Sx)?(?7=V9n<+r&YD1bC^ z2bQp!O`{sJ#NewDG3{yULXY~+bV_$esS;#zA_pL8Ay+>4mDx+b>y?tbao|RZ6w`2R zP#s9hKV~+Wef1sAy41aIj&l%b$FiHq6%?o`KAEUZPUEPf#&#RU?K5ZjWgrcb%)esp zQS0Dk>v}sPpl+J#YPyOi`P2*x%tmRSe{M07IyUwGd7-aC`94G z;62UB&Bcya_2b9fF7ux*MwjGU0799?NdmI)_w4lB@6SCnCxK9orK8f>ih!+pc6UC% zIXuE0$6iE&*5u&^IX#xxwox)}CEM zA1!x-zEl8W(#hY=S1(14@&IV1o=D47!$s3B!JnopGBBbt+4oJS;$LsC*OI4Ixy z+r7WKbvLvLurJs>MN?wq_tQh0D$3xbqs(1huR>CY0;A&SA0omBs$6}%FiG&JhGF~+2kR=(M<)f3gOeq~KVUuDHqfqlc0vd0DpqIe_%38|Cx`g4%w+ zJqD@4GyxYE=)8|LZTvya&gNcCZ%c6wAbx9~tZ;b(Sg!i}PRSP-SS1%fE^e