Draft: Refactored IPWRTBaseSerialSumFac with cached quadmetrics
Issue/feature addressed
The current implementation of IPWRTBaseSerialSumFac does not cache the quadrature metric (quadrature weights multiplied by the Jacobian). As a result, it runs 30-50% slower than the legacy implementation, which does cache it.
Proposed solution
Allocate an array that caches the quadrature metric (weights * Jacobian) during the initial compute (i.e. the first time step). Later calls access the cached array through a pointer and apply it to the input vector with a single Vmul.
Implementation
- The kernel functions MultiplyByQuadMetricSeg/Tri/Quad/Tet/Pyr/Prism/HexKernel (in IProductWRTBaseSerialSumFac.hpp:118,163,207,261,315,370,423) cache the corresponding quadrature metric in an array QuadMetric (IProductWRTBaseSerialSumFac.hpp:474), which is accessed through the pointer quadmetricptr (IProductWRTBaseSerialSumFac.hpp:476).
- Once the quadrature metric is cached, MetricCount = True. Each kernel then performs a vector multiplication (vmul) between the input vector and the cached metric and stores the result in wsp (IProductWRTBaseSerialSumFac.hpp:123,169,212,268,321,376,429), which is then used to compute the inner product with the bases.
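For reviewers, the caching pattern described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in IProductWRTBaseSerialSumFac.hpp: the struct QuadMetricCache, its members, the free function MultiplyByQuadMetric, and the local Vmul helper are all hypothetical names chosen for this example. It assumes the weights and Jacobian do not change between calls and that all arrays hold one value per quadrature point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the cached state; not the actual class in the PR.
struct QuadMetricCache
{
    std::vector<double> quadMetric; // weight * Jacobian per quadrature point
    bool cached = false;            // plays the role of the MetricCount flag
    int buildCount = 0;             // how often the metric was built (illustration only)

    // Returns a pointer to the cached metric, computing it on first use only.
    const double *Get(const std::vector<double> &weights,
                      const std::vector<double> &jac)
    {
        if (!cached)
        {
            assert(weights.size() == jac.size());
            quadMetric.resize(weights.size());
            for (std::size_t i = 0; i < weights.size(); ++i)
            {
                quadMetric[i] = weights[i] * jac[i];
            }
            cached = true;
            ++buildCount;
        }
        return quadMetric.data();
    }
};

// Pointwise product z[i] = x[i] * y[i]; a local helper with the same role
// as a vmul call, written out here so the sketch is self-contained.
inline void Vmul(std::size_t n, const double *x, const double *y, double *z)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        z[i] = x[i] * y[i];
    }
}

// Scales the input by the cached metric and writes the result into wsp,
// which a caller would then pass on to the inner product with the bases.
inline void MultiplyByQuadMetric(QuadMetricCache &cache,
                                 const std::vector<double> &weights,
                                 const std::vector<double> &jac,
                                 const std::vector<double> &in,
                                 std::vector<double> &wsp)
{
    assert(in.size() == weights.size());
    const double *quadmetricptr = cache.Get(weights, jac);
    wsp.resize(in.size());
    Vmul(in.size(), in.data(), quadmetricptr, wsp.data());
}
```

In this sketch, the first call to MultiplyByQuadMetric pays for building the metric, and every later call reduces to a single pointwise multiplication. If the geometry could change between calls, the cached flag would need to be reset, which this example does not handle.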
Tests
Passed all tests with test_ipwrtbase_serial_sumfac.cpp
Suggested reviewers
Jacques
Checklist
- Suitable tests added for new functionality.
- Contributed code is correctly formatted. (See the contributing guidelines).
- License added to any new files.