Basic implementation of Matrix-free BwdTrans and padding feature in Field
This matrix-free BwdTrans implementation supports all shape types, any number of elements, and SIMD.
A new padding feature is also added to the `Field` class.
- First, `GetBlockAttrb()` computes `num_padding_elements` according to the vector width, and `block_size` includes the padding as well.
- Then `Field::create()` allocates `m_storage` based on `block_size`, which is large enough for later use. The alignment is also set here.
- By default, a newly created `Field` is marked as `m_curVecWidth = 1`, which means no interleaving. There will be unused memory (`num_padding_elements * num_pts`) at the end of each block. Make sure you don't access it accidentally. For example, in the `StdMat` implementation, to move to the next block we should use `inptr += block_size` instead of `inptr += num_pts * num_elements`.
- `ReshapeStorage()` changes the storage layout to other widths, but not greater than the initial width. Usually, we only need to transform between `scalar` and `vec_t::width`.
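The padding arithmetic and the block stride described above can be illustrated with a small standalone sketch. This is not the code in this merge request: `BlockAttributes`, `MakeBlockAttributes` and `SumRealEntries` are hypothetical names, and the sketch assumes that the padding rounds the element count up to the next multiple of the vector width, with the field stored in the scalar (non-interleaved) layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the values computed by GetBlockAttrb().
struct BlockAttributes
{
    std::size_t num_elements;         // real elements in the block
    std::size_t num_pts;              // points per element
    std::size_t num_padding_elements; // dummy elements appended to the block
    std::size_t block_size;           // entries per block, padding included
};

// Assumption: the element count is rounded up to a multiple of vec_width,
// so that the block can later be interleaved at that width.
BlockAttributes MakeBlockAttributes(std::size_t num_elements,
                                    std::size_t num_pts,
                                    std::size_t vec_width)
{
    assert(vec_width > 0);
    const std::size_t remainder = num_elements % vec_width;
    const std::size_t padding = (remainder == 0) ? 0 : vec_width - remainder;
    return {num_elements, num_pts, padding, (num_elements + padding) * num_pts};
}

// Sum only the real entries of each block (scalar layout assumed).
double SumRealEntries(const std::vector<double> &storage,
                      const std::vector<BlockAttributes> &blocks)
{
    double total = 0.0;
    const double *inptr = storage.data();
    for (const auto &block : blocks)
    {
        const std::size_t used = block.num_elements * block.num_pts;
        for (std::size_t i = 0; i < used; ++i)
        {
            total += inptr[i];
        }
        // Step over the padding too: block_size, not num_pts * num_elements.
        inptr += block.block_size;
    }
    return total;
}
```

With 5 elements of 3 points each and a vector width of 4, this sketch adds 3 padding elements and a block of 24 entries, of which the last 9 are unused. Advancing by `num_pts * num_elements` (15) would land inside the padding of the first block rather than at the start of the next one.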
The simple CTest case (requiring padding) works.
Edited by BOYANG XIA