Basic implementation of Matrix-free BwdTrans and padding feature in Field
This matrix-free BwdTrans implementation supports all shape types, any number of elements, and SIMD. A new padding feature is also added to the `Field` class.
- First, `GetBlockAttrb()` computes `num_padding_elements` according to the vector width, and `block_size` includes the padding as well.
- Then `Field::create()` allocates `m_storage` based on `block_size`, which is large enough for later use. The alignment is also set here.
- By default, the newly created `Field` is marked as `m_curVecWidth = 1`, which means no interleaving. There will be unused memory (`num_padding_elements * num_pts`) at the end of each block. Make sure you don't access it accidentally. For example, in the `StdMat` implementation, to move to the next block we should use `inptr += block_size` instead of `inptr += num_pts * num_elements`.
- `ReshapeStorage()` changes the storage layout to other widths, but not greater than the initial width. Usually, we only need to transform between `scalar` and `vec_t::width`.
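The padding and block-stride rules above can be sketched as follows. This is only an illustration, not the code in this merge request: `BlockAttributes`, `MakeBlockAttributes` and `SumRealEntries` are hypothetical names, and the sketch assumes the element count is rounded up to the next multiple of the vector width.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the attributes computed by GetBlockAttrb().
struct BlockAttributes
{
    std::size_t num_pts;              // points per element
    std::size_t num_elements;         // real elements in the block
    std::size_t num_padding_elements; // padding elements at the end of the block
    std::size_t block_size;           // entries per block, padding included
};

// Assumption: the element count is rounded up to a multiple of vec_width.
inline BlockAttributes MakeBlockAttributes(std::size_t num_pts,
                                           std::size_t num_elements,
                                           std::size_t vec_width)
{
    const std::size_t rem = num_elements % vec_width;
    const std::size_t pad = (rem == 0) ? 0 : vec_width - rem;
    return {num_pts, num_elements, pad, (num_elements + pad) * num_pts};
}

// Sums only the real entries of num_blocks identical blocks stored back to
// back in scalar layout (m_curVecWidth = 1), skipping the padding.
inline double SumRealEntries(const double *inptr, const BlockAttributes &b,
                             std::size_t num_blocks)
{
    double sum = 0.0;
    for (std::size_t blk = 0; blk < num_blocks; ++blk)
    {
        for (std::size_t i = 0; i < b.num_pts * b.num_elements; ++i)
        {
            sum += inptr[i];
        }
        // Advance by block_size, not num_pts * num_elements, so the
        // padding at the end of the block is stepped over.
        inptr += b.block_size;
    }
    return sum;
}
```

With 5 elements of 4 points each and a vector width of 4, this sketch gives 3 padding elements and a `block_size` of 32, so striding by `num_pts * num_elements` (20) would land inside the first block's data and padding rather than at the start of the next block.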
The simple CTest case (requiring padding) works.
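To illustrate why the padding matters for `ReshapeStorage()`, here is a minimal sketch of a scalar-to-interleaved layout change. The index mapping is an assumption (element-major scalar layout, groups of `width` elements interleaved point by point), and `InterleaveBlock` and `DeinterleaveBlock` are hypothetical helpers rather than functions from this branch. The mapping only works when the padded element count is a multiple of `width`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar layout (assumed):      scalar[e * num_pts + p]
// Interleaved layout (assumed): out[(g * num_pts + p) * width + lane],
// where element e = g * width + lane.
// Precondition: scalar.size() is a multiple of num_pts * width, which the
// padding elements are meant to guarantee.
inline std::vector<double> InterleaveBlock(const std::vector<double> &scalar,
                                           std::size_t num_pts,
                                           std::size_t width)
{
    std::vector<double> out(scalar.size());
    const std::size_t num_groups = scalar.size() / (num_pts * width);
    for (std::size_t g = 0; g < num_groups; ++g)
    {
        for (std::size_t p = 0; p < num_pts; ++p)
        {
            for (std::size_t lane = 0; lane < width; ++lane)
            {
                out[(g * num_pts + p) * width + lane] =
                    scalar[(g * width + lane) * num_pts + p];
            }
        }
    }
    return out;
}

// Inverse mapping: back from the interleaved layout to the scalar layout.
inline std::vector<double> DeinterleaveBlock(const std::vector<double> &vec,
                                             std::size_t num_pts,
                                             std::size_t width)
{
    std::vector<double> out(vec.size());
    const std::size_t num_groups = vec.size() / (num_pts * width);
    for (std::size_t g = 0; g < num_groups; ++g)
    {
        for (std::size_t p = 0; p < num_pts; ++p)
        {
            for (std::size_t lane = 0; lane < width; ++lane)
            {
                out[(g * width + lane) * num_pts + p] =
                    vec[(g * num_pts + p) * width + lane];
            }
        }
    }
    return out;
}
```

In this sketch, each group of `width` elements occupies one contiguous run of `num_pts * width` values, so a SIMD load at a given point index picks up the same point from `width` different elements.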
Edited by BOYANG XIA
requested review from @ccantwel
assigned to @ccantwel
added 3 commits
- bd70ace1...554abc7b - 2 commits from branch nektar:master
- f8e561d0 - Merge branch 'master' into bwd_trans_matfree4
requested review from @CFD-Xing
added 5 commits
- c930d6d6...d05e90c5 - 2 commits from branch nektar:master
- c92e427d - Merge branch 'master' into bwd_trans_matfree4
- 9915531a - reinterpret casting works
- 61890e0c - move square.xml
added 1 commit
- 2a986e9a - new design: reinterpret double to vectorType
added 1 commit
- 8a449c07 - Fix CUDA compilation error, test still failing
     // Generate a blocks definition from the expansion list for each state
     using vec_t = tinysimd::simd<double>;
    +#ifndef NEKTAR_USE_CUDA
     auto blocks_in =
         GetBlockAttributes(stateIn, fixt_explist, vec_t::width);
     auto blocks_out =
         GetBlockAttributes(stateOut, fixt_explist, vec_t::width);
    +#else