Implement CUDA PhysDeriv sum-factorization kernels (!13) · Merge requests · Nektar / redesign-prototype

CUDA kernels for the PhysDeriv operation have been implemented for all element types (seg, quad, tri, hex, tet, prism, and pyr). All implementations are based on the existing SIMD matrix-free version.
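For reviewers unfamiliar with the pattern, here is a minimal CPU reference sketch of a sum-factorized derivative on a single quad, in plain C++. It is not the CUDA kernel from this MR. The function names, the row-major layout, and the barycentric construction of the 1D derivative matrix are assumptions made for illustration and are not taken from the Nektar++ API. It only computes reference-space derivatives; the geometric factors that a full PhysDeriv would apply are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: dense 1D differentiation matrix for Lagrange
// interpolation on distinct nodes x, row-major, D[i * n + j] = l_j'(x_i).
// Built from barycentric weights w_j = 1 / prod_{k != j} (x_j - x_k).
std::vector<double> lagrange_deriv_matrix(const std::vector<double>& x)
{
    const std::size_t n = x.size();
    std::vector<double> w(n, 1.0);
    std::vector<double> D(n * n, 0.0);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t k = 0; k < n; ++k)
            if (k != j) w[j] /= (x[j] - x[k]);
    for (std::size_t i = 0; i < n; ++i)
    {
        double diag = 0.0;
        for (std::size_t j = 0; j < n; ++j)
        {
            if (j == i) continue;
            D[i * n + j] = (w[j] / w[i]) / (x[i] - x[j]);
            diag -= D[i * n + j]; // each row sums to zero
        }
        D[i * n + i] = diag;
    }
    return D;
}

// Hypothetical reference routine: reference-space derivatives on one quad.
// u is stored with direction 0 fastest: u[j * nq0 + i].
// Only the 1D matrices D0 (nq0 x nq0) and D1 (nq1 x nq1) are applied, one
// direction at a time; no (nq0*nq1) x (nq0*nq1) operator is ever formed.
void quad_deriv_sumfac(std::size_t nq0, std::size_t nq1,
                       const std::vector<double>& D0,
                       const std::vector<double>& D1,
                       const std::vector<double>& u,
                       std::vector<double>& du0,
                       std::vector<double>& du1)
{
    for (std::size_t j = 0; j < nq1; ++j)
    {
        for (std::size_t i = 0; i < nq0; ++i)
        {
            double s0 = 0.0;
            for (std::size_t k = 0; k < nq0; ++k)
                s0 += D0[i * nq0 + k] * u[j * nq0 + k]; // contract along direction 0
            double s1 = 0.0;
            for (std::size_t k = 0; k < nq1; ++k)
                s1 += D1[j * nq1 + k] * u[k * nq0 + i]; // contract along direction 1
            du0[j * nq0 + i] = s0;
            du1[j * nq0 + i] = s1;
        }
    }
}
```

In a GPU version of this sketch, the two outer loops would typically be the natural candidates to map onto threads, with one element per block or per group of threads; how the kernels in this MR actually distribute the work is not described here.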

To avoid repeated copies from CPU to GPU, the derivative data are copied to the GPU once, in the constructor.

Edited Jul 13, 2023 by Jacques Xing

