
Implement CUDA IProductWRTDerivBase sum-factorization kernels

1 unresolved thread

CUDA kernels for the IProductWRTDerivBase operation have been implemented for all element types (seg, quad, tri, hex, tet, prism, and pyr). All implementations are based on the SIMD-based matrix-free version.

To avoid repeated copies from CPU to GPU, the geometric and derivative data are copied to the GPU once, in the constructor.
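The copy-once idea can be sketched in host-only C++ as follows. This is an illustration under stated assumptions, not the code in this merge request: `DeviceBuffer`, `IProductOperator`, and `g_uploads` are hypothetical names, and the `std::vector` copy merely stands in for a `cudaMalloc`/`cudaMemcpy` pair so that the example runs without a GPU.

```cpp
#include <cstddef>
#include <vector>

// Counts simulated host-to-device transfers (hypothetical, for illustration).
static int g_uploads = 0;

// Hypothetical stand-in for a device allocation. In real CUDA code the
// constructor would call cudaMalloc and cudaMemcpy(..., cudaMemcpyHostToDevice).
class DeviceBuffer
{
public:
    explicit DeviceBuffer(const std::vector<double> &host) : m_data(host)
    {
        ++g_uploads; // one simulated transfer per buffer construction
    }
    const double *data() const { return m_data.data(); }
    std::size_t size() const { return m_data.size(); }

private:
    std::vector<double> m_data;
};

// Hypothetical operator: the geometric factors are uploaded once in the
// constructor and reused by every apply() call.
class IProductOperator
{
public:
    explicit IProductOperator(const std::vector<double> &jac) : m_jac(jac) {}

    // Scales each input value by the cached factor; no transfer happens here.
    void apply(const std::vector<double> &in, std::vector<double> &out) const
    {
        out.resize(in.size());
        for (std::size_t i = 0; i < in.size(); ++i)
        {
            out[i] = m_jac.data()[i % m_jac.size()] * in[i];
        }
    }

private:
    DeviceBuffer m_jac;
};
```

Constructing one `IProductOperator` and calling `apply` several times leaves `g_uploads` at 1, which is the property the description is after: the transfer cost is paid at setup rather than on every operator evaluation.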

Edited by Jacques Xing

Merged by Chris Cantwell 1 year ago (Sep 12, 2023 9:11am UTC)

Merge details

  • Changes merged into master with 4360038b (commits were squashed).
  • Deleted the source branch.

Activity

 }

 virtual void apply(Field<TData, FieldState::Phys> &in,
-                   Field<TData, FieldState::Coeff> &out) = 0;
+                   Field<TData, FieldState::Coeff> &out,
+                   const TData lambda = 1.0) = 0;