Implement CUDA IProductWRTDerivBase sum-factorization kernels
Merged
requested to merge CFD-Xing/redesign-prototypes:iproductderivbase_sum_fac_cuda_kernels into master
1 unresolved thread
CUDA kernels for the IProductWRTDerivBase operation have been implemented for all element types (seg, quad, tri, hex, tet, prism, and pyr). All implementations are based on the SIMD-based matrix-free version.
To avoid repeated copies from CPU to GPU, geometric and derivative data are copied to the GPU once, in the constructor.
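As a rough illustration of that copy-once design, here is a minimal sketch in plain C++. Everything in it is hypothetical: `FakeDeviceBuffer` stands in for a real device allocation (which would use `cudaMalloc` and `cudaMemcpy`), and `IProductWRTDerivBaseSketch` is not the class from this merge request. The arithmetic in `apply` is a placeholder, not the sum-factorization kernel. The only point is that uploads happen in the constructor and not on every call.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device buffer. A real implementation would
// allocate with cudaMalloc and copy with cudaMemcpy; a std::vector keeps
// this sketch runnable on a machine without a GPU.
struct FakeDeviceBuffer
{
    std::vector<double> data;

    // Counts simulated host-to-device copies.
    static int &uploads()
    {
        static int count = 0;
        return count;
    }

    explicit FakeDeviceBuffer(const std::vector<double> &host) : data(host)
    {
        ++uploads();
    }
};

// Hypothetical operator: geometric factors and derivative data are uploaded
// once, in the constructor, and reused by every call to apply().
class IProductWRTDerivBaseSketch
{
public:
    IProductWRTDerivBaseSketch(const std::vector<double> &jac,
                               const std::vector<double> &deriv)
        : m_jac(jac), m_deriv(deriv)
    {
    }

    // Placeholder arithmetic only: sums in[i] * jac[i] * deriv[i] using the
    // buffers uploaded at construction. No copy is triggered here.
    double apply(const std::vector<double> &in) const
    {
        const std::size_t n = std::min(
            in.size(), std::min(m_jac.data.size(), m_deriv.data.size()));
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i)
        {
            sum += in[i] * m_jac.data[i] * m_deriv.data[i];
        }
        return sum;
    }

private:
    FakeDeviceBuffer m_jac;
    FakeDeviceBuffer m_deriv;
};
```

Constructing one operator and calling `apply` several times leaves the upload counter at two (one per buffer), which is the behaviour the description is aiming for.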
Edited by Jacques Xing
Activity
added 33 commits
- 4a9be4b2...04b93e55 - 6 commits from branch nektar:master
- 04b93e55...285e6285 - 17 earlier commits
- c9f19d22 - Merge remote-tracking branch 'upstream/master' into iproduct_sum_fac_cuda_kernels
- 90b83a90 - Some fixes
- 86136594 - Fix merge error
- 130b12fd - Merge remote-tracking branch 'upstream/master' into iproduct_sum_fac_cuda_kernels
- fb936a54 - Fix cuda bug
- caaf8533 - Add CUDA test
- 5e66561e - Include forgotten file
- 00617dca - Merge branch 'iproduct_sum_fac_cuda_kernels' into physderiv_sum_fac_cuda_kernels
- 5d0debfa - Merge branch 'physderiv_sum_fac_cuda_kernels' into iproductderivbase_sum_fac_cuda_kernels
- de88a9ba - Preliminary implementation of IProductWRTDerivBaseStdMat
added 1 commit
- 7924c85f - Extend PhysDeriv to deform mesh and redesign deallocator
added 3 commits
- 8cefb1f6...554abc7b - 2 commits from branch nektar:master
- 2760a399 - Merge remote-tracking branch 'upstream/master' into iproductderivbase_sum_fac_cuda_kernels
assigned to @ccantwel
@ccantwel I think that this is ready to be merged but PhysDeriv should be merged first.
added 4 commits
added 3 commits
- 9a7b9d3f...d05e90c5 - 2 commits from branch nektar:master
- 32cd9db5 - Merge remote-tracking branch 'upstream/master' into iproductderivbase_sum_fac_cuda_kernels
@CFD-Xing I will merge this now, but we should really try and add some unit tests for the operators.
mentioned in commit 4360038b
     }

     virtual void apply(Field<TData, FieldState::Phys> &in,
-                       Field<TData, FieldState::Coeff> &out) = 0;
+                       Field<TData, FieldState::Coeff> &out,
+                       const TData lambda = 1.0) = 0;
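The change under discussion adds a defaulted `lambda` argument to the pure virtual `apply`. The sketch below shows how such a signature behaves when called through a base reference. It is a simplified, hypothetical stand-in: `OperatorSketch` and `ScaledCopy` are illustrative names, `std::vector` replaces the real `Field<TData, FieldState::...>` types, and treating `lambda` as a plain scaling factor is an assumption, since the thread does not say how it is used.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified base class mirroring the new signature shape.
template <typename TData>
class OperatorSketch
{
public:
    virtual ~OperatorSketch() = default;

    // Existing two-argument callers keep compiling because lambda defaults
    // to 1.0.
    virtual void apply(const std::vector<TData> &in,
                       std::vector<TData> &out,
                       const TData lambda = 1.0) = 0;
};

// Hypothetical derived operator. Assumption: lambda scales the output.
template <typename TData>
class ScaledCopy : public OperatorSketch<TData>
{
public:
    // The default is repeated with the same value: C++ picks default
    // arguments from the static type of the call, not the dynamic type.
    void apply(const std::vector<TData> &in,
               std::vector<TData> &out,
               const TData lambda = 1.0) override
    {
        out.resize(in.size());
        for (std::size_t i = 0; i < in.size(); ++i)
        {
            out[i] = lambda * in[i];
        }
    }
};
```

Because default arguments are bound statically, keeping the same default value in every override avoids a call through the base class and a call through the derived class silently using different values of `lambda`.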