Commit af00a637 authored by Spencer Sherwin's avatar Spencer Sherwin

Added the developers guide to the main distribution

parent 435cad35
@@ -3,8 +3,3 @@
path = docs/tutorial
url =
ignore = all
[submodule "docs/developer-guide"]
branch = master
path = docs/developer-guide
url =
ignore = all
Subproject commit b1461b45abb08c48397fe6d046249703c4f8f160
IF (EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/developer-guide/CMakeLists.txt)
    SET(DEVGUIDE ${CMAKE_BINARY_DIR}/docs/developers-guide)

    ADD_CUSTOM_TARGET(developer-guide-html
        ${HTLATEX} ${DEVGUIDESRC}/developers-guide.tex)

    # If tex4ht successful, create img dir and copy images across
    FILE(GLOB_RECURSE imgfiles "img/*.png" "img/*.jpg" "*/img/*.png" "*/img/*.jpg")
    ADD_CUSTOM_COMMAND(TARGET developer-guide-html
        POST_BUILD COMMAND ${CMAKE_COMMAND} -E make_directory ${DEVGUIDE}/html/img)
    FOREACH(img ${imgfiles})
        ADD_CUSTOM_COMMAND(TARGET developer-guide-html
            POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy ${img} ${DEVGUIDE}/html/img)
    ENDFOREACH()

    # PDF figures are converted to PNG for the HTML output
    FILE(GLOB_RECURSE pdffiles "*/img/*.pdf")
    FOREACH(pdf ${pdffiles})
        GET_FILENAME_COMPONENT(BASENAME ${pdf} NAME_WE)
        ADD_CUSTOM_COMMAND(TARGET developer-guide-html
            POST_BUILD COMMAND ${CONVERT} ${pdf} ${DEVGUIDE}/html/img/${BASENAME}.png)
    ENDFOREACH()

    # Run pdflatex three times to resolve cross-references and the contents
    ADD_CUSTOM_TARGET(developer-guide-pdf
        ${PDFLATEX} --output-directory ${DEVGUIDE} ${DEVGUIDESRC}/developers-guide.tex
        COMMAND ${PDFLATEX} --output-directory ${DEVGUIDE} ${DEVGUIDESRC}/developers-guide.tex
        COMMAND ${PDFLATEX} --output-directory ${DEVGUIDE} ${DEVGUIDESRC}/developers-guide.tex)
ENDIF()
This is the developer guide for Nektar++.
\newcommand{\BM}[1]{\mbox{\boldmath $#1$}}
\newcommand{\BMB}[1]{\mbox{$\mathbb #1$}}
%Yu Pan's commands
% \usepackage{multicol}
% \setlength{\columnsep}{0.5cm}
%end of Yu Pan's commands
\title{A Programmer's Guide to Nektar++}
% Render pretty title page if not building HTML
\huge{Nektar++ Developer's Guide}
\part{Building-Blocks of Our Framework (Inside the Library)} \label{part:library}
\part{Solvers} \label{part:solvers}
\part{Utilities} \label{part:utilities}
\part{NekPy: Python interface to \nek{}} \label{part:nekpy}
\section{The Fundamental Algorithms within Collections}
As stated in the introduction, this section of the guide is structured in question-and-answer form. This is not meant to capture every possible
question asked of us on the {\nek} users list; rather, this (ever-growing) set of questions is meant to capture the ``big ideas'' that developers
want to know about how Collections work and how they can be used.
In this section we will, through a question-and-answer format, cover the following basic algorithmic concepts found within
the Collections part of the library:
\item xx
With the big ideas in place, let us now start into our questions.
\section{The Fundamental Data Structures within Collections}
\section{The Fundamentals Behind Collections}
The concept of ``collections'' is that many operations we (conceptually) accomplish on an element-by-element basis can be aggregated into a single operation. The
basic idea is this: whenever you see a piece of code consisting of a loop whose body performs the same operation (e.g. a function evaluation) on
an input variable tied to the loop index, you may be able to turn that operation into a ``collection''. For instance, consider the following snippet of code in Matlab notation:
for j = 1:N
    y(j) = dot(a, x(:,j));
end
\noindent where $y$ is a row vector of length $N$, $x$ is an $M \times N$ matrix and $a$ is a row vector of length $M$. Note that in this code snippet the
vector $a$ remains constant; only the vector involved in the dot product changes with the index (i.e. selecting a column of the array).
From linear algebra, you might recognize this as the way we accomplish the product of a row vector with a matrix: we march through the matrix accomplishing
dot products. Although the two formulations are mathematically equivalent, they are not equivalent {\em from the perspective of computational efficiency}. Most linear algebra operations within {\nek} are done using BLAS -- Basic Linear Algebra Subroutines -- a collection of specialized routines
for accomplishing Level 1 (single loop), Level 2 (doubly nested loop) and Level 3 (triply nested loop) operations. A dot product is an example of a Level 1
BLAS operation; a matrix-vector multiplication is an example of a Level 2 BLAS operation; and a matrix-matrix multiplication is an example of a Level 3
BLAS operation. The general rule of thumb is that as you move up the levels, the operations can be made more efficient (using various algorithmic
reorganization and memory layout strategies). Hence, when you see a loop over Level 1 BLAS calls (as in the piece of code above), you should always
ask yourself whether it could be transformed into a single Level 2 BLAS call. Since the vector $a$ does not change, this piece of code should be replaced with an
appropriate vector-matrix multiply.
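To make this concrete, the following sketch (in Python with NumPy for brevity; it is illustrative only, not {\nek} code -- within the library the corresponding operations go through the BLAS bindings in Blas.hpp) compares the loop of Level 1 operations with the equivalent single Level 2 operation. The array names mirror the snippet above.

```python
import numpy as np

M, N = 4, 6
rng = np.random.default_rng(0)
a = rng.random(M)        # constant row vector of length M
x = rng.random((M, N))   # M x N matrix

# Level 1 approach: N separate dot products, one per column of x
y_loop = np.empty(N)
for j in range(N):
    y_loop[j] = np.dot(a, x[:, j])

# Level 2 approach: a single vector-matrix multiply (in BLAS terms, dgemv)
y_gemv = a @ x

# Both produce the same row vector y of length N
assert np.allclose(y_loop, y_gemv)
```

The single Level 2 call gives the BLAS implementation the whole matrix at once, so it can apply blocking and vectorization that are impossible when the data is fed to it one column at a time.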
As seen in StdRegions and LocalRegions, we often conceptually think of many of our basic building-block operations as being elemental. We have
elemental basis matrices, local stiffness matrices, etc. Although it is correct to implement an operation as an iteration over elements in which each element accomplishes
an action such as evaluation (done by multiplying a basis evaluation matrix by a coefficient vector), the operation can be made more efficient by ganging all these
elemental operations together into a {\em collection}.
As mentioned earlier, one of the caveats of this approach is that you must assume some level of consistency of the operation. For instance, in the case of physical evaluations, you must assume that a collection consists of elements that are all of the same type, use the same basis (number and ordering of modes), and are evaluated at the same set of points -- otherwise the operation cannot be expressed as a single (basis) matrix multiplied by a matrix whose columns consist of the modes
of all the elements that have joined the collective operation.
As will be seen in the data structure and algorithms sections of this discussion, these assumptions lead us to two fundamental types of collections: collections that live at the StdRegions level (i.e. collection operations that act on a group of elements as represented in reference space) and collections that live at the LocalRegions level (i.e. collection operations that act on a group of elements as represented in world space).
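The payoff of a collection is precisely this promotion from Level 2 to Level 3: stacking the coefficient vectors of many like elements as the columns of one matrix turns a loop of per-element matrix-vector products into a single matrix-matrix product. A minimal sketch (Python/NumPy for brevity, with hypothetical sizes; {\nek}'s Collections are C++):

```python
import numpy as np

n_quad, n_modes, n_elmt = 5, 3, 8
rng = np.random.default_rng(1)
B = rng.random((n_quad, n_modes))        # shared basis evaluation matrix
coeffs = rng.random((n_modes, n_elmt))   # one column of modes per element

# Element-by-element: one Level 2 (matrix-vector) product per element
phys_loop = np.column_stack([B @ coeffs[:, e] for e in range(n_elmt)])

# Collection: one Level 3 (matrix-matrix) product over all elements at once
phys_coll = B @ coeffs

assert np.allclose(phys_loop, phys_coll)
```

Note that this only works because every column of `coeffs` is multiplied by the same matrix `B` -- the consistency assumption discussed above.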
\section{The Fundamental Algorithms within FieldUtils}
\section{The Fundamental Data Structures within FieldUtils}
\section{The Fundamentals Behind FieldUtils}
\section{The Fundamental Algorithms within GlobalMapping}
\section{The Fundamental Data Structures within GlobalMapping}
\section{The Fundamentals Behind GlobalMapping}
Based upon the appendix in \cite{CantwellYKPS14}, we
outline a rigorous derivation of the Laplace-Beltrami operator.
We use the convention that
indices appearing once in the upper position and once in the lower position are
considered dummy indices and are implicitly summed over their range, while
non-repeated indices are considered free to take any value. Derivatives are also
denoted using the lower-index comma notation, for example $g_{ij,k}$.
{\em Covariant} vectors, such as the gradient, are those which, under a change of
coordinate system, change under the same transformation in order to maintain
coordinate-system invariance. In contrast, {\em contravariant} vectors, such as
velocity, should remain fixed under a coordinate transformation, requiring that
their components change under the inverse of the transformation to maintain this invariance.
With this in mind, we now construct the fundamental differential operators we
require for a 2-dimensional manifold embedded in a 3-dimensional space.
In order to express these operators in curvilinear coordinates we start by
assuming that we have a smooth surface parametrization given by
% \label{A:parametrization}
Next we will define the Jacobian of $\bm{x}$ as the tensor
J_i^j = \frac{\partial x^j}{\partial \xi^i}
% \label{A:jacobian}
where $J_i^j$ can be viewed as a covariant surface vector (by fixing the upper
index) or as a contravariant space vector (by fixing the lower index). The
surface metric tensor $g_{ij}$ can be defined in terms of the $J_i^j$ as
g_{ij} = \sum_{k=1}^3 J_i^k J_j^k,
which can be considered to transform a contravariant quantity to a covariant
quantity. Similarly, the conjugate tensor $g^{ij}$, which performs the reverse
transformation, is given by
g^{11} = g_{22}/g,\quad g^{12} = g^{21} = - g_{12}/g,\quad g^{22} = g_{11}/g,
where $g$ is the determinant of $g_{ij}$. The metric tensor and its conjugate
satisfy the condition
\delta_i^j = g_{ik}g^{jk} =
1,\; &\mbox{if } i = j,\\
0,\; &\mbox{if } i\ne j.
% %
To construct the divergence operator we will also need the derivative of
$g$ with respect to components of the metric, $g_{ij}$. We know that $\bm{g}$
is invertible and from linear algebra we have that the inverse of the metric
\eqref{A:conj_tensor} satisfies
$\bm{g}^{-1} = \frac{1}{g}\tilde{\bm{g}}^{\top}$, where $\bm{\tilde{g}}$ is the
cofactor matrix of $\bm{g}$. Therefore $\bm{\tilde{g}}^{\top} = g\BM{g}^{-1}$,
or in components $\tilde{g}^{ij} = g(\BM{g}^{-1})_{ji}$. Using Jacobi's formula
for the derivative of a matrix determinant with respect to its entries, and since
$\bm{g}$ is invertible, the derivative of the metric determinant is
\frac{\partial g}{\partial g_{ij}} = \mathrm{tr}\left(\tilde{\bm{g}}^{\top}\frac{\partial \bm{g}}{\partial g_{ij}}\right)
= \tilde{g}^{ij} = g(\BM{g}^{-1})_{ji} = gg^{ij}.
\subsection{Divergence operator}
The partial derivative of a tensor with respect to a manifold coordinate system
is itself not a tensor. In order to obtain a tensor, one has to use the
\emph{covariant} derivative, defined below.
% %
The covariant derivative of a contravariant vector is given by
\nabla_k a^i = a^i_{,k} + a^j \Gamma_{jk}^i,
where $\Gamma_{jk}^i$ are the \emph{Christoffel symbols of the second kind}.
% %
The \emph{Christoffel symbols of the first kind} are defined by
\Gamma_{ijk} = \frac{1}{2}\left[g_{kj,i} + g_{ik,j} - g_{ij,k}\right].
% \label{A:CS1}
Here we note that $\Gamma_{ijk}$ is symmetric in the first two indices. To
obtain the Christoffel symbols of the second kind we formally raise the
last index using the conjugate tensor,
\Gamma_{ij}^l = \Gamma_{ijk}g^{kl}
which retains the symmetry in the lower two indices. We can now express the
derivatives of the metric tensor in terms of the Christoffel symbols as
g_{ij,k} = \Gamma_{ikj} + \Gamma_{jki}
= g_{lj}\Gamma_{ik}^l + g_{li}\Gamma_{jk}^l.
% \label{A:metric_through_CS}
We now define the divergence operator on the manifold,
$\nabla\cdot \bm{X} = \nabla_k X^k$. Consider first the derivative of the
determinant of the metric tensor $g$ with respect to
the components of some local coordinates system $\xi^1,\xi^2$. We apply the
chain rule, making use of the derivative of the metric tensor with respect to
components of the metric \eqref{A:g_deriv} and the relationship \eqref{A:CS2},
to get
\frac{\partial g}{\partial \xi^k} = \frac{\partial g}{\partial g_{ij}}\frac{\partial g_{ij}}{\partial \xi^k} =
gg^{ij}g_{ij,k} = gg^{ij}(\Gamma_{ikj} + \Gamma_{jki}) = g(\Gamma_{ik}^i + \Gamma_{jk}^j) = 2g\Gamma_{ik}^i.
We can therefore express the Christoffel symbol $\Gamma_{ik}^i$ in terms of this
derivative as
\Gamma_{ik}^i = \frac{1}{2g}\frac{\partial g}{\partial \xi^k} = \frac{1}{\sqrt{g}}\frac{\partial\sqrt{g}}{\partial \xi^k}.
Finally, by substituting for $\Gamma_{ik}^i$ in the expression for the
divergence operator
\nabla_k X^k &= X^k_{,k} + X^i\Gamma_{ki}^k \\
&= X^k_{,k} + X^k\Gamma_{ik}^i \\
&= X^k_{,k} + X^k\frac{1}{\sqrt{g}}(\sqrt{g})_{,k}
we can deduce a formula for divergence of a contravariant vector as
\nabla\cdot \bm{X} = \nabla_k X^k = \frac{\left(X^k\sqrt{g}\right)_{,k}}{\sqrt{g}}
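As a concrete check of this formula (a flat, two-dimensional example rather than a curved manifold), consider plane polar coordinates $(\xi^1,\xi^2)=(r,\theta)$ with $x^1 = r\cos\theta$, $x^2 = r\sin\theta$. Then $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$ and $\sqrt{g} = r$, so
\[
\nabla\cdot\bm{X} = \frac{1}{r}\left[\frac{\partial}{\partial r}\left(r X^r\right) + \frac{\partial}{\partial \theta}\left(r X^\theta\right)\right]
= \frac{1}{r}\frac{\partial}{\partial r}\left(r X^r\right) + \frac{\partial X^\theta}{\partial \theta},
\]
which agrees with the familiar polar divergence once the contravariant component $X^\theta$ is rescaled to the unit-basis component $r X^\theta$.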
\subsection{Laplacian operator}
The covariant derivative (gradient) of a scalar on the manifold is identical to
the partial derivative, $\nabla_k \phi = \phi_{,k}$.
To derive the Laplacian operator we need the contravariant form of the covariant
gradient above which can be found by raising the index using the metric tensor,
\nabla^k \phi = g^{kj}\phi_{,j},
and substituting \eqref{A:covar-div} for $X^k$ in \eqref{A:div} to get the Laplacian operator
on the manifold as
\Delta_M\phi = \frac{1}{\sqrt{g}}\left(\sqrt{g}g^{ij}\phi_{,j}\right)_{,i}.
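For instance, for the unit sphere parametrized by $(\xi^1,\xi^2)=(\theta,\varphi)$ with $\bm{x} = (\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta)$, the metric is $g_{11} = 1$, $g_{22} = \sin^2\theta$, $g_{12} = 0$, so that $\sqrt{g} = \sin\theta$, $g^{11} = 1$ and $g^{22} = 1/\sin^2\theta$. The formula above then reproduces the familiar spherical Laplacian
\[
\Delta_M\phi = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\phi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\phi}{\partial\varphi^2}.
\]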
\subsection{Anisotropic diffusion}
We now extend the above operator to allow for anisotropic diffusion in the domain by deriving an expression for the surface conductivity from the ambient conductivity. The ambient-conductivity-scaled gradient $\tilde{\nabla}^p$ of a surface function $f$ is given by
\tilde{\nabla}^p f = g^{mp} J^l_m \sigma_{kl} J^k_j g^{ij} \frac{\partial f}{\partial x^i}.
The surface gradient is mapped to the ambient space through the Jacobian $J^k_j$, scaled by the ambient conductivity, and mapped back to the surface through $J^l_m$. The anisotropic Laplacian operator is given by
\tilde{\nabla}^2 f = \nabla_k \tilde{\nabla}^k f = \nabla_k \tilde{\sigma}^{kj} \nabla_j f
Therefore the surface conductivity tensor can be computed using the Jacobian tensor and the inverse metric as
\bm{\tilde{\sigma}} = \bm{g^{-1}J\sigma J^{\top} g^{-1}}.
\subsection{Anisotropic Laplacian operator}
Anisotropic diffusion is important in many applications. In the ambient
Euclidean space, this can be represented by a diffusivity tensor $\bm{\sigma}$
in the Laplacian operator as
\Delta_M = \nabla \cdot \bm{\sigma} \nabla.
On our manifold, we seek the generalisation of \ref{A:surf_lapl}, in the form
\tilde{\Delta}_M \phi = \nabla_j \tilde{\sigma}_i^j \nabla^i \phi,
where the $\tilde{\sigma}_i^j$ are the entries of the surface diffusivity.
% %
For a contravariant surface vector $a^j$ we can find the associated
space vector $A^i$ as $A^i = J^i_j a^j$. Similarly if $A_i$ is a covariant space vector, then $a_j=J^i_j A_i$ is a covariant surface vector.
% %
Using these we can construct the anisotropic
diffusivity tensor $\tilde{\bm{\sigma}}$ on the manifold by constraining the ambient diffusivity tensor $\bm{\sigma}$ to the surface. The contravariant surface gradient $\nabla^i \phi$ is mapped to the corresponding space vector, which lies in the tangent plane to the surface. This is scaled by the ambient diffusivity and then projected back to a covariant surface vector. Finally, we use the conjugate metric to convert back to a contravariant form. The resulting surface Laplacian is
\tilde{\Delta}_M \phi = \nabla_m g^{lm}J^k_l\sigma_{jk}J^j_i \nabla^i \phi.
Following on from this we deduce that
\tilde{\sigma}_i^j = g^{jm} J^l_m \sigma_{lk} J^k_i.
It can be seen that in the case of isotropic diffusion, $\bm{\sigma} = \bm{I}$, we recover $\tilde{\sigma}_i^j = \delta_i^j$, since
\tilde{\sigma}_i^j = g^{im} J^k_m \sigma_{lk} J^l_j = g^{im} J^k_m J^k_j = g^{im} g_{mj} = \delta^i_j.
This directory contains two important files for all of {\nek}: NektarUnivConsts.hpp and NektarUnivTypeDefs.hpp.
The file NektarUnivConsts.hpp contains various default constants used within {\nek} as seen here:
\lstinputlisting[language=C++, firstline=47, lastline=59]{../../library/LibUtilities/BasicConst/NektarUnivConsts.hpp}
The file NektarUnivTypeDefs.hpp contains the low level typedefs such as: NekDouble, NekInt, OneD, TwoD, ThreeD, FourD, and
enumerations such as Direction (xDir, yDir and zDir) and OutputFormat.
This directory contains some of the lowest level basic computer science routines within {\nek}.
The directory currently contains the following:
\begin{tabular}{|c | c | c |} \hline
ArrayEqualityComparison.cpp & ArrayPolicies.hpp & CompressData.cpp \\ \hline
CompressData.h & ConsistentObjectAccess.hpp & CsvIO.cpp \\ \hline
CsvIO.h & Equation.cpp & Equation.h \\ \hline
ErrorUtil.hpp & FieldIO.cpp & FieldIO.h \\ \hline
FieldIOHdf5.cpp & FieldIOHdf5.h & FieldIOXml.cpp \\ \hline
FieldIOXml.h & FileSystem.cpp & FileSystem.h \\ \hline
H5.cpp & H5.h & HashUtils.hpp \\ \hline
Metis.hpp & {\bf NekFactory.hpp} & {\bf NekManager.hpp} \\ \hline
OperatorGenerators.hpp & ParseUtils.cpp & ParseUtils.h \\ \hline
Progressbar.hpp & PtsField.cpp & PtsField.h \\ \hline
PtsIO.cpp & PtsIO.h & RawType.hpp \\ \hline
Scotch.hpp & SessionReader.cpp & SessionReader.h \\ \hline
ShapeType.hpp & SharedArray.hpp & Tau.hpp \\ \hline
Thread.cpp & Thread.h & ThreadBoost.cpp \\ \hline
ThreadBoost.h & Timer.cpp & Timer.h \\ \hline
VDmath.hpp & VDmathArray.hpp & Vmath.cpp \\ \hline
{\bf Vmath.hpp} & VmathArray.hpp & VtkUtil.hpp \\ \hline
We have used {\bf bold} to denote (as examples) routines that are used throughout {\nek}; they are in this sense ``fundamental''.
Note that this list includes input/output routines (e.g. CompressData, FieldIO, H5), partitioning (e.g. Metis and Scotch) and threading (e.g. Thread and ThreadBoost).
This directory contains files related to our distributed-memory communication model. In particular, this directory contains files that help encapsulate MPI (Message Passing Interface) routines, as well
as the Gather-Scatter (GS) and Xxt routines of Paul Fischer (Argonne National Laboratory and UIUC).
\begin{tabular}{|c | c | c |} \hline
Comm.cpp & CommMpi.cpp & GsLib.hpp \\ \hline
Comm.h & CommMpi.h & Transposition.cpp \\ \hline
CommDataType.cpp & CommSerial.cpp & Transposition.h \\ \hline
CommDataType.h & CommSerial.h & Xxt.hpp \\ \hline
This directory contains files related to the use of Fast Fourier Transforms within {\nek}. The two groups of files are as follows:
\item NekFFTW.h/NekFFTW.cpp : Wrapper around FFTW library; and
\item NektarFFT.h/NektarFFT.cpp : Fast Fourier Transform base class in {\nek}.
This directory contains two files, AnalyticExpressionEvaluator.hpp and AnalyticExpressionEvaluator.cpp, both of which provide
parsing and evaluation functionality for analytic expressions.
This directory contains two files, kernel.h and kernel.cpp, both of which are related to the line-SIAC (L-SIAC) filtering \cite{sanchez2016multi,jallepalli2017treatment} of FEM solutions.
SIAC filtering takes the form:
u^*(x) = \int_{-\infty}^{\infty} K_H\left(\frac{\xi - x}{H}\right) u_h(\xi) \, d\xi
\noindent where $u_h$ denotes the FEM (continuous Galerkin or discontinuous Galerkin) solution and $K_H$ represents a kernel function:
K_H(x) = \frac{1}{H} \sum_{\gamma = 0}^M c_\gamma \psi\left(\frac{x}{H} - \zeta_\gamma \right).
In this expression, the functions $\psi(x)$ are B-Splines. The two kernel files in this directory provide all the tools needed to construct the B-Spline functions and the kernel function described above.
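The central building block here is the B-spline $\psi$. The following self-contained sketch (Python, for illustration only; it does not reproduce the routines in kernel.h/kernel.cpp) evaluates uniform B-splines via the standard Cox-de Boor recursion and checks the partition-of-unity property that kernel constructions of this type rely on.

```python
def bspline(x, k, i, t):
    """Value at x of the order-k (degree k-1) B-spline with knot vector t, index i."""
    if k == 1:
        # Order-1 spline: indicator of the half-open knot interval
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = 0.0
    if t[i + k - 1] != t[i]:
        left = (x - t[i]) / (t[i + k - 1] - t[i]) * bspline(x, k - 1, i, t)
    right = 0.0
    if t[i + k] != t[i + 1]:
        right = (t[i + k] - x) / (t[i + k] - t[i + 1]) * bspline(x, k - 1, i + 1, t)
    return left + right

# Uniform knots: in the interior, the order-k splines overlapping any point
# sum to one (partition of unity).
k = 3
t = list(range(12))          # knots 0, 1, ..., 11
for x in [3.5, 4.25, 6.0, 7.9]:
    s = sum(bspline(x, k, i, t) for i in range(len(t) - k))
    assert abs(s - 1.0) < 1e-12
```

The kernel $K_H$ is then a fixed linear combination of shifted copies of such splines, with the coefficients $c_\gamma$ chosen to reproduce polynomials up to a given degree.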
\section{Linear Algebra}
This directory contains files related to linear algebra operations. Files such as NekMatrix, BlockMatrix and NekLinearSystem are fundamental to many of the operations we perform within {\nek}.
These files represent our attempt to encapsulate linear algebra operations in a way that makes sense from the programmer's perspective but does not lose efficiency. As can be seen in these
routines and throughout the code, we rely heavily on BLAS (Basic Linear Algebra Subroutines).
\begin{tabular}{|c | c | c |} \hline
Arpack.hpp & Blas.hpp & BlasArray.hpp \\ \hline
BlockMatrix.cpp & BlockMatrix.hpp & CanGetRawPtr.hpp \\ \hline
ExplicitInstantiation.h & Lapack.hpp & MatrixBase.cpp \\ \hline
MatrixBase.hpp & MatrixFuncs.cpp & MatrixFuncs.h \\ \hline
MatrixOperations.cpp & MatrixOperations.hpp & MatrixStorageType.h \\ \hline
MatrixType.h & MatrixVectorMultiplication.cpp & NekLinAlgAlgorithms.hpp \\ \hline
NekLinSys.hpp & NekMatrix.hpp & NekMatrixFwd.hpp \\ \hline
NekPoint.hpp & NekTypeDefs.hpp & NekVector.cpp \\ \hline
NekVector.hpp & NekVectorFwd.hpp & NistSparseDescriptors.hpp \\ \hline
PointerWrapper.h & ScaledMatrix.cpp & ScaledMatrix.hpp \\ \hline
SparseDiagBlkMatrix.cpp & SparseDiagBlkMatrix.hpp & SparseMatrix.cpp \\ \hline
SparseMatrix.hpp & SparseMatrixFwd.hpp & SparseUtils.cpp \\ \hline
SparseUtils.hpp & StandardMatrix.cpp & StandardMatrix.hpp \\ \hline
StorageSmvBsr.cpp & StorageSmvBsr.hpp & TransF77.hpp \\ \hline
blas.cpp & & \\ \hline
This directory contains three files, NekMemoryManager.hpp, ThreadSpecificPools.hpp and ThreadSpecificPools.cpp.
The strategy used within {\nek} is to preallocate a pool of arrays that can be used for various operations and then
released back to the pool. This idea came about through early profiling of the code, which showed that the new/delete
of many small arrays used for temporary calculations was greatly slowing down the code. As with our manager
idea, we decided to invest in a memory pool object that preallocates blocks of memory which can be requested and
then returned to the pool.
If {\nek} is compiled with \verb+NEKTAR_MEMORY_POOL_ENABLED+, the MemoryManager
allocates small objects from thread-specific memory pools, while large objects are managed with the
system-supplied new/delete. These memory pools provide faster allocation and deallocation
of small objects (particularly useful for shared pointers, which
allocate many 4-byte objects).
All memory allocated from the memory manager must be returned
to the memory manager. Calling delete on memory allocated from the
manager will likely cause undefined behavior. A particularly subtle
violation of this rule occurs when giving memory allocated from the
manager to a shared pointer.
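To illustrate the pool idea in isolation (a minimal free-list sketch in Python; this is not the MemoryManager API, whose C++ implementation lives in ThreadSpecificPools.hpp/.cpp), blocks are handed out from a preallocated store and must go back to the same pool, never to the system allocator:

```python
class FixedSizePool:
    """Minimal fixed-size block pool: preallocate, hand out, take back."""

    def __init__(self, block_size, n_blocks):
        # All blocks are allocated up front, once
        self._free = [bytearray(block_size) for _ in range(n_blocks)]

    def allocate(self):
        if not self._free:
            raise MemoryError("pool exhausted")
        return self._free.pop()

    def deallocate(self, block):
        # Returned blocks rejoin the free list for reuse -- they are
        # never handed to the system allocator's delete/free.
        self._free.append(block)

pool = FixedSizePool(block_size=64, n_blocks=2)
a = pool.allocate()
b = pool.allocate()
pool.deallocate(a)
c = pool.allocate()   # reuses the block just returned, no fresh allocation
assert c is a
```

The speed-up comes from replacing many small heap allocations with cheap list operations on memory that was allocated once; the corresponding rule for users is exactly the one stated above, that pool memory must be returned to the pool.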
This directory contains polylib.h and polylib.cpp. These files contain foundational routines used for computing various operations related to Jacobi polynomials.
The following abbreviations are used throughout the file:
\item z -- Set of collocation/quadrature points
\item w -- Set of quadrature weights
\item D -- Derivative matrix
\item h -- Lagrange Interpolant
\item I -- Interpolation matrix
\item g -- Gauss
\item k -- Kronrod
\item gr -- Gauss-Radau
\item gl -- Gauss-Lobatto
\item j -- Jacobi
\item m -- point at minus 1 in Radau rules
\item p -- point at plus 1 in Radau rules
\paragraph{Points and Weights: } The following routines are used to compute points and weights:
\item zwgj -- Compute Gauss-Jacobi points and weights
\item zwgrjm -- Compute Gauss-Radau-Jacobi points and weights ($z=-1$)
\item zwgrjp -- Compute Gauss-Radau-Jacobi points and weights ($z=1$)
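As a concrete (if simplified) illustration of what such a points-and-weights routine produces: for the Jacobi weights $\alpha=\beta=0$, Gauss-Jacobi quadrature reduces to Gauss-Legendre, which NumPy can generate directly (this sketch does not call Polylib itself). A rule with $Q$ points integrates polynomials of degree up to $2Q-1$ exactly on $[-1,1]$.

```python
import numpy as np

Q = 4                                        # number of quadrature points
z, w = np.polynomial.legendre.leggauss(Q)    # the alpha = beta = 0 case

# Exact for polynomials of degree <= 2Q - 1 = 7:
# the integral of x^6 over [-1, 1] is 2/7
approx = np.sum(w * z**6)
assert abs(approx - 2.0 / 7.0) < 1e-12
```

The zwgrjm/zwgrjp variants fix one endpoint ($z=-1$ or $z=1$) as a quadrature point, trading one degree of exactness for the ability to impose values at the boundary.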