Nektar++ cannot yet be used with GPU co-processors. This MR enables the evaluation of key Collections operators on NVIDIA GPUs.
It adds CUDA implementations of the following operator kernels to the Collections library: Helmholtz, BwdTrans, PhysDeriv and IProduct.
Implementation details
The kernels are integrated through the existing factory pattern in Collections, which decouples the CUDA code from the other library kernels and from the other Nektar++ libraries. As a result, CUDA support can be switched on with a CMake option: the CUDA modules are compiled only if requested, and nothing elsewhere in the code depends on them directly.
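The decoupling described above can be illustrated with a minimal C++ sketch of a self-registering factory. This is not the Nektar++ Collections API: `Operator`, `OperatorFactory`, the string keys and the `SKETCH_ENABLE_CUDA` macro are all hypothetical names chosen for illustration, and the "CUDA" class only stands in for a real GPU kernel wrapper.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical operator interface (assumption, not the Collections API).
struct Operator
{
    virtual ~Operator() = default;
    virtual std::string Name() const = 0;
};

// Hypothetical registry mapping a string key to a creator function.
class OperatorFactory
{
public:
    using Creator = std::function<std::unique_ptr<Operator>()>;

    // Function-local static avoids static-initialisation-order problems.
    static OperatorFactory &Instance()
    {
        static OperatorFactory factory;
        return factory;
    }

    // Returns true if the key was newly registered.
    bool Register(const std::string &key, Creator creator)
    {
        return m_creators.emplace(key, std::move(creator)).second;
    }

    bool Has(const std::string &key) const
    {
        return m_creators.count(key) != 0;
    }

    std::unique_ptr<Operator> Create(const std::string &key) const
    {
        auto it = m_creators.find(key);
        if (it == m_creators.end())
        {
            throw std::out_of_range("no operator registered for " + key);
        }
        return it->second();
    }

private:
    std::map<std::string, Creator> m_creators;
};

// A CPU implementation registers itself at static-initialisation time.
struct BwdTransCPU : Operator
{
    std::string Name() const override { return "BwdTrans_CPU"; }
};
static const bool s_cpuRegistered = OperatorFactory::Instance().Register(
    "BwdTrans_CPU",
    []() -> std::unique_ptr<Operator> { return std::make_unique<BwdTransCPU>(); });

// The GPU variant is compiled, and therefore registered, only when the
// (hypothetical) build macro is defined. No other code names this class.
#ifdef SKETCH_ENABLE_CUDA
struct BwdTransCUDA : Operator
{
    std::string Name() const override { return "BwdTrans_CUDA"; }
};
static const bool s_cudaRegistered = OperatorFactory::Instance().Register(
    "BwdTrans_CUDA",
    []() -> std::unique_ptr<Operator> { return std::make_unique<BwdTransCUDA>(); });
#endif
```

Calling code asks the factory for an implementation by key, for example `OperatorFactory::Instance().Create("BwdTrans_CPU")`, and can query `Has("BwdTrans_CUDA")` at run time. Because the GPU class is only ever reached through the registry, leaving it out of the build removes it without touching any caller.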
The kernels support two CUDA vector-widths: 1 and 32.
- To install the CUDA operators at this stage, the Scotch library should be installed with double-precision support by including the flag `-IDXSIZE64` in the file `/cmake/ThirdPartyScotch.cmake`. It is therefore recommended to let Nektar++ compile Scotch rather than use a system-installed version.
- If METIS is used, the `#define REALTYPEWIDTH 32` should be manually changed to `#define REALTYPEWIDTH 64` in the file `/build/ThirdParty/metis-5.1.0/include/metis.h` for double precision.