CUDA implementation of key operators in the Collections library
Issue addressed
Nektar++ cannot yet be used with GPU co-processors. This MR enables the evaluation of key Collections operators on NVIDIA GPUs.
Proposed solution
This MR adds CUDA implementations of the key operator kernels (Helmholtz, BwdTrans, PhysDeriv and IProduct) to the Collections library.
Detail on implementation
The kernels are integrated through the existing factory pattern in Collections, which decouples them from the other kernels in the library and from the other Nektar++ libraries. As a result, CUDA support can be switched on with a CMake option: the CUDA modules are compiled only if requested, and no other part of the code depends on them directly.
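The decoupling described here can be illustrated with a minimal, self-contained sketch of a self-registering factory. This is not the Nektar++ Collections API: `Operator`, `OperatorFactory`, `BwdTransCPU`, `BwdTransCUDA` and the macro `EXAMPLE_USE_CUDA` are hypothetical names chosen for illustration, and the "CUDA" class is a host-side placeholder rather than a real kernel.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical operator interface (not the Nektar++ Collections API).
struct Operator
{
    virtual ~Operator() = default;
    virtual std::string Name() const = 0;
};

using Creator = std::function<std::unique_ptr<Operator>()>;

// Hypothetical factory keyed on (operator type, implementation type).
class OperatorFactory
{
public:
    static OperatorFactory &Instance()
    {
        // Function-local static: constructed on first use, so it is
        // safe to call from static initialisers in other places.
        static OperatorFactory factory;
        return factory;
    }

    bool Register(const std::string &op, const std::string &impl, Creator c)
    {
        m_creators[std::make_pair(op, impl)] = std::move(c);
        return true;
    }

    bool Has(const std::string &op, const std::string &impl) const
    {
        return m_creators.count(std::make_pair(op, impl)) > 0;
    }

    std::unique_ptr<Operator> Create(const std::string &op,
                                     const std::string &impl) const
    {
        auto it = m_creators.find(std::make_pair(op, impl));
        if (it == m_creators.end())
        {
            throw std::runtime_error("No implementation registered: " +
                                     op + "/" + impl);
        }
        return it->second();
    }

private:
    std::map<std::pair<std::string, std::string>, Creator> m_creators;
};

// Always-available reference implementation.
struct BwdTransCPU : Operator
{
    std::string Name() const override { return "BwdTrans (CPU)"; }
};
static const bool regBwdTransCPU = OperatorFactory::Instance().Register(
    "BwdTrans", "CPU", []() -> std::unique_ptr<Operator> {
        return std::unique_ptr<Operator>(new BwdTransCPU());
    });

// Compiled and registered only when the (hypothetical) build option
// defines EXAMPLE_USE_CUDA; nothing else refers to this class by name.
#ifdef EXAMPLE_USE_CUDA
struct BwdTransCUDA : Operator
{
    std::string Name() const override { return "BwdTrans (CUDA)"; }
};
static const bool regBwdTransCUDA = OperatorFactory::Instance().Register(
    "BwdTrans", "CUDA", []() -> std::unique_ptr<Operator> {
        return std::unique_ptr<Operator>(new BwdTransCUDA());
    });
#endif
```

Calling code would request an operator by key, for example `OperatorFactory::Instance().Create("BwdTrans", "CPU")`, and could use `Has("BwdTrans", "CUDA")` to check whether the optional implementation was compiled in. Because registration happens inside the optional block, removing that block leaves the rest of the code unchanged.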
The kernels support two CUDA vector-widths: 1 and 32.
Notes
- In order to install CUDA operators at this stage, the Scotch library should be installed with double-precision support by including the flag `-IDXSIZE64` in the file `/cmake/ThirdPartyScotch.cmake`. Therefore, it is recommended to request that Nektar++ compile Scotch, rather than using a system-installed version.
- If METIS is used, the `#define REALTYPEWIDTH 32` should be manually changed to `#define REALTYPEWIDTH 64` in the file `/build/ThirdParty/metis-5.1.0/include/metis.h` for double precision.
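One possible safeguard for the METIS setting is a compile-time check that stops the build if `REALTYPEWIDTH` is not 64. The sketch below is an assumption, not part of this MR: `MetisIsDoublePrecision` is a hypothetical helper, and the fallback `#define` exists only so that the snippet compiles on its own. In a real build the value would come from including `metis.h`.

```cpp
// Stand-in so this sketch compiles in isolation; a real build would
// include metis.h instead, which defines REALTYPEWIDTH.
#ifndef REALTYPEWIDTH
#define REALTYPEWIDTH 64
#endif

// Hypothetical helper: true when the METIS real type width is 64 bits.
constexpr bool MetisIsDoublePrecision(int width)
{
    return width == 64;
}

// Fails at compile time if METIS was left at the 32-bit setting.
static_assert(MetisIsDoublePrecision(REALTYPEWIDTH),
              "METIS must use REALTYPEWIDTH 64 for the CUDA operators");
```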
Edited by Chris Cantwell