Implement CUDA IProductWRTDerivBase sum-factorization kernels
Merged
requested to merge CFD-Xing/redesign-prototypes:iproductderivbase_sum_fac_cuda_kernels into master
CUDA kernels for the IProductWRTDerivBase operation have been implemented for all element types (seg, quad, tri, hex, tet, prism, and pyr). All implementations are based on the SIMD matrix-free version.
To avoid repeated copies from CPU to GPU, the geometric and derivative data are copied to the GPU once, in the constructor.
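For readers unfamiliar with sum-factorization, here is a minimal host-side C++ sketch of the idea for a single quad element. It is not the code in this merge request: `Basis1D`, `MakeMonomialBasis`, `IProductWRTDerivNaive`, `IProductWRTDerivSumFac`, and `MaxAbsDiff` are hypothetical names, the basis is a plain monomial basis with a midpoint rule, and the geometric factors are assumed to be already folded into the two input fields `f0` and `f1`. The sketch only shows how the double sum over quadrature points can be split into two one-dimensional contractions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1D basis table (not the library's data layout):
// B[p * nq + i] = phi_p(xi_i), D[p * nq + i] = dphi_p/dxi(xi_i), w[i] = weights.
struct Basis1D
{
    std::size_t nm; // number of modes per direction
    std::size_t nq; // number of quadrature points per direction
    std::vector<double> B, D, w;
};

// Illustrative basis: monomials x^p with a midpoint rule on [-1, 1].
Basis1D MakeMonomialBasis(std::size_t nm, std::size_t nq)
{
    Basis1D b{nm, nq, std::vector<double>(nm * nq),
              std::vector<double>(nm * nq), std::vector<double>(nq)};
    for (std::size_t i = 0; i < nq; ++i)
    {
        const double x = -1.0 + (2.0 * i + 1.0) / nq;
        b.w[i] = 2.0 / nq;
        for (std::size_t p = 0; p < nm; ++p)
        {
            b.B[p * nq + i] = std::pow(x, static_cast<double>(p));
            b.D[p * nq + i] =
                (p == 0) ? 0.0 : p * std::pow(x, static_cast<double>(p - 1));
        }
    }
    return b;
}

// Direct evaluation: O(nm^2 * nq^2) operations.
// out(p,q) = sum_{i,j} w_i w_j [ D_p(i) B_q(j) f0(i,j) + B_p(i) D_q(j) f1(i,j) ]
std::vector<double> IProductWRTDerivNaive(const Basis1D &b,
                                          const std::vector<double> &f0,
                                          const std::vector<double> &f1)
{
    assert(f0.size() == b.nq * b.nq && f1.size() == b.nq * b.nq);
    std::vector<double> out(b.nm * b.nm, 0.0);
    for (std::size_t p = 0; p < b.nm; ++p)
        for (std::size_t q = 0; q < b.nm; ++q)
            for (std::size_t i = 0; i < b.nq; ++i)
                for (std::size_t j = 0; j < b.nq; ++j)
                {
                    const std::size_t k = i * b.nq + j;
                    out[p * b.nm + q] += b.w[i] * b.w[j] *
                        (b.D[p * b.nq + i] * b.B[q * b.nq + j] * f0[k] +
                         b.B[p * b.nq + i] * b.D[q * b.nq + j] * f1[k]);
                }
    return out;
}

// Sum-factorized evaluation: O(nm * nq^2 + nm^2 * nq) operations.
std::vector<double> IProductWRTDerivSumFac(const Basis1D &b,
                                           const std::vector<double> &f0,
                                           const std::vector<double> &f1)
{
    assert(f0.size() == b.nq * b.nq && f1.size() == b.nq * b.nq);
    // Step 1: contract over j for every quadrature point i and mode q.
    std::vector<double> t0(b.nq * b.nm, 0.0), t1(b.nq * b.nm, 0.0);
    for (std::size_t i = 0; i < b.nq; ++i)
        for (std::size_t q = 0; q < b.nm; ++q)
            for (std::size_t j = 0; j < b.nq; ++j)
            {
                const std::size_t k = i * b.nq + j;
                t0[i * b.nm + q] += b.w[j] * b.B[q * b.nq + j] * f0[k];
                t1[i * b.nm + q] += b.w[j] * b.D[q * b.nq + j] * f1[k];
            }
    // Step 2: contract over i for every mode pair (p, q).
    std::vector<double> out(b.nm * b.nm, 0.0);
    for (std::size_t p = 0; p < b.nm; ++p)
        for (std::size_t q = 0; q < b.nm; ++q)
            for (std::size_t i = 0; i < b.nq; ++i)
                out[p * b.nm + q] += b.w[i] *
                    (b.D[p * b.nq + i] * t0[i * b.nm + q] +
                     b.B[p * b.nq + i] * t1[i * b.nm + q]);
    return out;
}

// Largest absolute entry-wise difference between two equally sized vectors.
double MaxAbsDiff(const std::vector<double> &a, const std::vector<double> &c)
{
    assert(a.size() == c.size());
    double m = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k)
        m = std::fmax(m, std::fabs(a[k] - c[k]));
    return m;
}
```

Calling both functions on the same `f0` and `f1` and comparing the results with `MaxAbsDiff` should give agreement to rounding error. In a GPU version, the two contraction steps are the natural units to map onto threads, with the basis tables kept resident on the device.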
Edited by Jacques Xing