In my implementation of sparse symv, I use the OpenMP reduction clause when computing the result of the lower triangle from the upper, since there are data races that can occur otherwise. However for very large calculations, the reduction clause causes segmentation faults. I think this might be due to OpenMP storing the reduction vectors on the stack, so the threads run out of stack memory if the reduction vector is too large. I have tried to increase the stack size for the threads (OMP_STACKSIZE,KMP_STACKSIZE), but it does not appear to help (which maybe means that the stack size is not the problem, but that's the only lead I have so far).
For now, the simplest solution will probably be to use mkl_?csrsymv instead, but according to MKL documentation those routines are deprecated in favor of mkl_sparse_?_mv (but it might be necessary to use a different storage format for these routines, need to double check).
In my implementation of sparse symv, I use the OpenMP reduction clause when computing the result of the lower triangle from the upper, since there are data races that can occur otherwise. However for very large calculations, the reduction clause causes segmentation faults. I think this might be due to OpenMP storing the reduction vectors on the stack, so the threads run out of stack memory if the reduction vector is too large. I have tried to increase the stack size for the threads (OMP_STACKSIZE,KMP_STACKSIZE), but it does not appear to help (which maybe means that the stack size is not the problem, but that's the only lead I have so far).
For now, the simplest solution will probably be to use mkl_?csrsymv instead, but according to MKL documentation those routines are deprecated in favor of mkl_sparse_?_mv (but it might be necessary to use a different storage format for these routines, need to double check).