Tidy-up CUDA implementation of IProductWRTBase and add CUDA kernels with additional parallelism (!73) · Merge requests · Nektar / redesign-prototype

This MR tidies up the previous CUDA implementation of the IProductWRTBase operator and introduces new CUDA kernels with additional parallelism across quadrature points.
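
For context, a sketch of the quantity involved, assuming the usual definition of an inner product with respect to the basis. The symbols are illustrative and not taken from the MR: $\phi_p$ is the $p$-th basis function, $\xi_q$ and $w_q$ are the quadrature points and weights, $J^{e}_{q}$ is the Jacobian of element $e$ at point $q$, and $u^{e}$ is the input field on that element.

```latex
\hat{u}^{e}_{p} \;=\; \sum_{q=1}^{N_q} \phi_p(\xi_q)\, w_q\, J^{e}_{q}\, u^{e}(\xi_q),
\qquad p = 1,\dots,N_m, \quad e = 1,\dots,N_{el}.
```

Under this reading, a kernel that parallelises only over elements $e$ (and possibly modes $p$) evaluates the sum over $q$ sequentially within each thread. Parallelism across quadrature points would instead distribute the per-point work, such as the products $w_q\, J^{e}_{q}\, u^{e}(\xi_q)$, over threads. How the kernels in this MR actually map threads to quadrature points is not stated in the description.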

Edited Jan 26, 2024 by Jacques Xing
