I can run problem 1 successfully on Frontier with fewer than 64 nodes, but I get a segmentation fault with 64 or more nodes:
Running with these driver parameters:
Problem ID = 1
=============================================
Hypre init times:
=============================================
Hypre init:
wall clock time = 0.000006 seconds
Laplacian_27pt:
(Nx, Ny, Nz) = (1600, 1600, 1600)
(Px, Py, Pz) = (8, 8, 8)
srun: error: frontier04522: tasks 282-287: Segmentation fault
srun: Terminating StepId=2131722.0
with Segmentation fault errors reported for all of the other MPI ranks as well.
I built Hypre v2.31.0 with:
./configure --with-hip --with-gpu-arch=gfx90a --with-MPI-lib-dirs="${MPICH_DIR}/lib" --with-MPI-libs="mpi" --with-MPI-include="${MPICH_DIR}/include" --enable-mixedint
with cce/17.0.0, rocm/5.7.1, and cray-mpich/8.1.28.
I'm running the problem with:
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=7
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=closest
#SBATCH -N 64
export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}
export MPICH_GPU_SUPPORT_ENABLED=1
srun ./amg -problem 1 -n 200 200 200 -P 8 8 8
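For reference, a quick arithmetic check on the run configuration above, offered as an observation and not a diagnosis. With `-n 200 200 200` per rank and `-P 8 8 8`, the global grid is 1600 x 1600 x 1600, which matches the driver output. The sketch below computes the rank count and the total number of unknowns, then compares that total with the signed 32-bit maximum. The variable names are illustrative, and whether this limit is related to the segmentation fault is an assumption that has not been verified.

```shell
#!/bin/bash
# Per-rank grid (-n) and process grid (-P), copied from the srun line above.
nx=200; ny=200; nz=200
px=8; py=8; pz=8

# Total MPI ranks and total number of unknowns (bash arithmetic is 64-bit).
ranks=$(( px * py * pz ))
global_rows=$(( nx * px * ny * py * nz * pz ))

# Largest value representable in a signed 32-bit integer.
int32_max=2147483647

echo "ranks       = ${ranks}"
echo "global rows = ${global_rows}"
if (( global_rows > int32_max )); then
  # Assumption: this is only a size comparison, not a confirmed cause.
  echo "global rows exceed the signed 32-bit limit (${int32_max})"
fi
```

At 8 ranks per node, 512 ranks corresponds to the 64-node job in the script above.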