Add CUDA backend implementation for AddTraceIntegral

Issue/feature addressed

The AddTraceIntegral operator was refactored into a general implementation, AddTraceIntegralImpl.hpp. For the CUDA backend, this introduces repeated device-to-host and host-to-device copies. The same problem will occur with a Kokkos-CUDA backend.
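To make the cost concrete, here is a minimal host-only C++ sketch of the copy pattern described above. It is not the actual operator code: `MockDeviceArray`, `addTraceHostFallback`, `addTraceDeviceNative`, and the scatter-add loop are hypothetical stand-ins, and no real GPU memory is involved. The mock only counts how many transfers a generic host-side path would trigger compared with a backend-native path.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device-resident array. It holds plain host
// memory and only counts the transfers a real backend would perform.
struct MockDeviceArray
{
    std::vector<double> data; // pretend this lives in device memory
    std::size_t d2h = 0;      // number of device-to-host copies
    std::size_t h2d = 0;      // number of host-to-device copies

    explicit MockDeviceArray(std::vector<double> v) : data(std::move(v)) {}

    std::vector<double> copyToHost()
    {
        ++d2h;
        return data;
    }

    void copyFromHost(const std::vector<double> &host)
    {
        ++h2d;
        data = host;
    }
};

// Generic path (assumed shape): every call pulls both arrays to the host,
// accumulates there, and pushes the result back to the device.
void addTraceHostFallback(MockDeviceArray &traceIn, MockDeviceArray &out,
                          const std::vector<std::size_t> &map)
{
    std::vector<double> hIn  = traceIn.copyToHost();
    std::vector<double> hOut = out.copyToHost();
    assert(map.size() == hIn.size());
    for (std::size_t i = 0; i < hIn.size(); ++i)
    {
        assert(map[i] < hOut.size());
        hOut[map[i]] += hIn[i]; // placeholder scatter-add, not the real operator
    }
    out.copyFromHost(hOut);
}

// Backend-native path (assumed shape): the data never leaves the device.
// In a real CUDA backend this loop would be a kernel launch.
void addTraceDeviceNative(const MockDeviceArray &traceIn, MockDeviceArray &out,
                          const std::vector<std::size_t> &map)
{
    assert(map.size() == traceIn.data.size());
    for (std::size_t i = 0; i < traceIn.data.size(); ++i)
    {
        assert(map[i] < out.data.size());
        out.data[map[i]] += traceIn.data[i];
    }
}
```

In this mock, each call to the fallback performs two device-to-host copies and one host-to-device copy, so the transfer count grows linearly with the number of operator applications, while the native path performs none.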

Proposed solution

  • Fully implement the locTraceToTraceMap in AddTraceIntegralSerialStdMat.hpp.
  • Add a specialized CUDA backend AddTraceIntegralCUDASumFac.cuh and a specialized Kokkos backend AddTraceIntegralKokkosStdMat.hpp.

Implementation

Tests

  • The existing CUDA test is used to test the proposed new implementation.
  • The existing Kokkos test is now disabled because a Kokkos backend for the IProductWRTBase operator is not yet implemented.

Suggested reviewers

Please suggest any people who would be appropriate to review your code.

Notes

Please add any other information that could be useful for reviewers.

Checklist

  • Functions and classes, or changes to them, are documented.
  • [ ] User guide/documentation is updated.
  • [ ] Changelog is updated.
  • Suitable tests added for new functionality.
  • Contributed code is correctly formatted. (See the contributing guidelines).
  • License added to any new files.
  • No extraneous files have been added (e.g. compiler output or test data files).
Edited by Jacques Xing