About this repository
This repository contains a basic benchmarking utility for matrix-free finite element evaluation. It uses Nektar++ as a library to construct the elements and mesh topology from a supplied XML file, and provides a Helmholtz operator for triangular, quadrilateral, hexahedral, prismatic and tetrahedral elements. The results in the paper below were generated with this code.
D. Moxey, R. Amici and R. M. Kirby, Efficient matrix-free high-order finite element evaluation for simplicial elements, SIAM J. Sci. Comput., 42 (3), pp. C97–C123, 2020
Compilation
Prerequisites
Nektar++ should first be compiled and installed into a location $NEKPP. The code also uses likwid to provide hardware benchmarking (e.g. recording flops and memory bandwidth for roofline models).
Configuration
This code can then be configured with CMake as
$ cmake -DNektar++_DIR=${NEKPP}/lib64/nektar++/cmake -DCMAKE_CXX_FLAGS="-mavx2 -mfma" ..
Note the use of -mavx2 to enable AVX2 instructions. AVX512 is also supported. This can then be compiled with
make -j$(nproc) install
Usage
The code is designed to be run with an input XML file that describes the geometry of interest. The main restriction is that the mesh must comprise a single element type; hybrid meshes (e.g. mixed quadrilateral-triangle) are not supported.
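The single-element-type restriction can be checked before running a benchmark. The following is a minimal sketch, assuming the Nektar++ XML layout in which elements are listed under NEKTAR/GEOMETRY/ELEMENT with one tag per shape (for example Q for quadrilaterals and T for triangles). The function names and the sample strings are hypothetical and are not part of this repository.

```python
import xml.etree.ElementTree as ET


def element_types(xml_text):
    """Return the set of element tags found under GEOMETRY/ELEMENT.

    Assumes the root element is NEKTAR and that each child of ELEMENT is
    named after its shape (an assumption about the mesh file layout).
    """
    root = ET.fromstring(xml_text)
    section = root.find("GEOMETRY/ELEMENT")
    if section is None:
        return set()
    return {child.tag for child in section}


def is_single_type(xml_text):
    """True when the mesh contains exactly one kind of element."""
    return len(element_types(xml_text)) == 1


# Hypothetical miniature meshes, used only to exercise the check.
QUAD_ONLY = """<NEKTAR><GEOMETRY DIM="2" SPACE="2"><ELEMENT>
<Q ID="0"> 0 1 2 3 </Q><Q ID="1"> 4 5 6 1 </Q>
</ELEMENT></GEOMETRY></NEKTAR>"""

HYBRID = """<NEKTAR><GEOMETRY DIM="2" SPACE="2"><ELEMENT>
<Q ID="0"> 0 1 2 3 </Q><T ID="1"> 4 5 1 </T>
</ELEMENT></GEOMETRY></NEKTAR>"""

print(is_single_type(QUAD_ONLY), is_single_type(HYBRID))
```

For a real mesh, the file contents can be read with `open(path).read()` and passed to `is_single_type`.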
There are two executables:
- benchmark_op will benchmark an operator.
- solve is a prototype solver using a conjugate gradient method and a matrix-free Jacobi preconditioner.
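To illustrate what a conjugate gradient method with a matrix-free Jacobi preconditioner involves, here is a small self-contained sketch. It is not the repository's implementation: it assumes a 1D Helmholtz problem with linear elements, homogeneous Dirichlet boundaries and a uniform mesh, chosen only for brevity. All function names, and the parameters h (element size) and lam (Helmholtz constant), are hypothetical.

```python
import math


def apply_helmholtz(u, h, lam):
    """Apply (stiffness + lam * mass) for 1D linear elements without forming a matrix.

    u holds interior nodal values; homogeneous Dirichlet boundaries are assumed.
    """
    full = [0.0] + list(u) + [0.0]           # pad with the two boundary nodes
    out = [0.0] * len(full)
    k = 1.0 / h                               # local stiffness scale
    m = lam * h / 6.0                         # local mass scale
    for e in range(len(full) - 1):            # element e joins nodes e and e + 1
        a, b = full[e], full[e + 1]
        out[e] += k * (a - b) + m * (2.0 * a + b)
        out[e + 1] += k * (b - a) + m * (a + 2.0 * b)
    return out[1:-1]                          # drop the boundary rows


def helmholtz_diagonal(n, h, lam):
    """Operator diagonal, accumulated element by element (also matrix-free)."""
    diag = [0.0] * (n + 2)
    local = 1.0 / h + lam * h / 3.0           # diagonal entry of one element matrix
    for e in range(n + 1):
        diag[e] += local
        diag[e + 1] += local
    return diag[1:-1]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def jacobi_pcg(apply_op, diag, rhs, tol=1e-12, max_iter=1000):
    """Solve A x = rhs by conjugate gradients with a Jacobi (diagonal) preconditioner."""
    x = [0.0] * len(rhs)
    r = list(rhs)                             # residual for the zero initial guess
    if math.sqrt(dot(r, r)) < tol:
        return x, 0
    z = [ri / di for ri, di in zip(r, diag)]  # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        ap = apply_op(p)                      # the only use of the operator
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(dot(r, r)) < tol:
            return x, iteration
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Manufactured test: build rhs from a known vector, then recover it.
n, lam = 31, 1.0
h = 1.0 / (n + 1)
exact = [(i + 1) * h * (1.0 - (i + 1) * h) for i in range(n)]
rhs = apply_helmholtz(exact, h, lam)
solution, iterations = jacobi_pcg(lambda v: apply_helmholtz(v, h, lam),
                                  helmholtz_diagonal(n, h, lam), rhs)
print(iterations, max(abs(s - e) for s, e in zip(solution, exact)))
```

The solver only ever calls `apply_op` and divides by the diagonal, so neither step requires an assembled global matrix. That is the property that makes the approach matrix-free.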
Generating a mesh
A sample .geo file for a quadrilateral mesh is included in the meshes directory. Run the following commands to generate a mesh with Gmsh and convert it using the NekMesh executable from Nektar++:
$ gmsh square-quads.geo
$ NekMesh square-quads.msh square-quads.xml
benchmark_op
This utility simply benchmarks one of five operators:
- backwards transformation (coefficient to physical space);
- inner product with respect to basis functions;
- derivative computation;
- Helmholtz operator;
- global Helmholtz operator (i.e. Helmholtz + assembly into a C^0 mesh).
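For reference, these operators can be written in the usual spectral/hp element notation. The symbols below are assumed notation rather than names taken from the code: basis functions \phi_n, coefficients \hat{u}_n, quadrature points \xi_q, element domain \Omega^e, Helmholtz constant \lambda and assembly matrix \mathcal{A}. The sign convention for the Helmholtz operator is the common weak-form one and may differ from the implementation.

```latex
\begin{align*}
  \text{backward transformation:} \quad
    & u(\xi_q) = \sum_{n} \hat{u}_n \, \phi_n(\xi_q), \\
  \text{inner product:} \quad
    & I_m = \int_{\Omega^e} \phi_m(x) \, u(x) \, \mathrm{d}x, \\
  \text{derivative:} \quad
    & u \mapsto \frac{\partial u}{\partial x_i}(\xi_q), \\
  \text{Helmholtz:} \quad
    & (\mathbf{H}^e \hat{u})_m = \sum_{n} \left[
        \int_{\Omega^e} \nabla\phi_m \cdot \nabla\phi_n \, \mathrm{d}x
        + \lambda \int_{\Omega^e} \phi_m \, \phi_n \, \mathrm{d}x
      \right] \hat{u}_n, \\
  \text{global Helmholtz:} \quad
    & \mathbf{H}_g = \mathcal{A}^{T} \Bigl( \bigoplus_{e} \mathbf{H}^e \Bigr) \mathcal{A}.
\end{align*}
```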
The utility should be run as:
# Options: -v enables verbose mode; order is the order of the operator; Ntest is
# the number of evaluations of the operator; deformed=0 forces the mesh to be
# not deformed; test selects the operator from the list above; the final
# argument is the XML file.
likwid-mpirun -hostfile ~/hostfile -nperdomain C:18 -m -g MEM_DP \
benchmark-op -v \
-P order=6 \
-P Ntest=1 \
-P deformed=0 \
-P test=3 \
quad-10.xml
The output will contain likwid's benchmarking information, as well as the evaluation time, estimated GFLOP/s, and throughput in degrees of freedom per second:
time = 4.445e-06, gflops = 2.74016, dof/s= 1.79978e+07
error= 4.16334e-17
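When collecting results from many runs, it can help to pull these numbers out of the output programmatically. The sketch below assumes the `key = value` layout of the sample lines above. The function name is hypothetical, and the reading of `time * dof/s` as the number of degrees of freedom per evaluation is an assumption about how the throughput is defined.

```python
import re

# Matches pairs such as "time = 4.445e-06" or "dof/s= 1.79978e+07".
_FIELD = re.compile(r"([A-Za-z/]+)\s*=\s*([-+0-9.eE]+)")


def parse_benchmark_line(line):
    """Return a dict mapping each key on the line to its value as a float."""
    return {key: float(value) for key, value in _FIELD.findall(line)}


sample = "time = 4.445e-06, gflops = 2.74016, dof/s= 1.79978e+07"
result = parse_benchmark_line(sample)

# Assumed interpretation: throughput = degrees of freedom / evaluation time.
implied_dofs = result["time"] * result["dof/s"]
print(result, implied_dofs)
```

Under that assumption, the sample line corresponds to roughly 80 degrees of freedom per evaluation.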
solve
This takes the order of the simulation as its only argument. At present, however, it does not appear to work following an update that brought the code in line with the main Nektar++ tree; we will endeavour to follow up on this in the near future.