Merge branch 'bwd_trans_sum_fac_cuda_kernels' into 'master'
Implement CUDA BwdTrans sum-factorization kernels

See merge request !11
Tags v4.3.4
Changed files:
- .gitignore (1 addition, 0 deletions)
- CMakeLists.txt (4 additions, 0 deletions)
- MemoryRegionCUDA.hpp (1 addition, 1 deletion)
- Operators/BwdTrans/BwdTransCUDA.hpp (314 additions, 4 deletions)
- Operators/BwdTrans/BwdTransCUDAKernels.cuh (943 additions, 0 deletions)
- Operators/BwdTrans/BwdTransStdMat.hpp (17 additions, 12 deletions)
- Operators/Operator.hpp (0 additions, 3 deletions)
- Operators/OperatorHelper.cuh (54 additions, 0 deletions)
- cube_all_elements.xml (37 additions, 0 deletions)
- main.cpp (158 additions, 99 deletions)
- segment.xml (44 additions, 0 deletions)
- square2.xml (0 additions, 41 deletions)
- square_all_elements.xml (63 additions, 0 deletions)
- tests/CMakeLists.txt (15 additions, 2 deletions)
- tests/init_fields.hpp (29 additions, 13 deletions)
- tests/test_bwdtrans.cpp (9 additions, 7 deletions)
- tests/test_bwdtranscuda.cpp (70 additions, 0 deletions)
- tests/test_ipwrtbase.cpp (2 additions, 2 deletions)