Commit e23f455a

Merge remote-tracking branch 'upstream/master' into feature/dump-mesh-on-failure

Conflicts:
	utilities/NekMesh/InputModules/InputMCF.cpp
parents 0fe0983b 8790dcd5
Showing with 23 additions and 5498 deletions
......@@ -3,3 +3,8 @@
path = docs/tutorial
url = git@gitlab.nektar.info:nektar/tutorial
ignore = all
[submodule "docs/developer-guide"]
branch = master
path = docs/developer-guide
url = git@gitlab.nektar.info:nektar/developer-guide
ignore = all
Changelog
=========
v4.5.0
------
**NekMesh**:
- Add periodic boundary condition meshing in 2D (!733)
- Adjust boundary layer thickness in corners in 2D (!739)
**Documentation**:
- Added the developer-guide repository as a submodule (!751)
v4.4.0
------
**Library**:
......@@ -100,9 +110,9 @@ v4.4.0
- Change variable names in mcf file to make more sense (!736)
- Fix issues in varopti module so that it can be compiled without meshgen on
(!736)
- Replace LAPACK Eigenvalue calculation with handwritten function in
varopti (!738)
- Improved node-colouring algorithm for better load-balancing
in varopti (!738)
- Simplified calculation of the energy functional in varopti for improved
performance (!738)
......
4.4.0
4.5.0
ADD_SUBDIRECTORY(user-guide)
ADD_SUBDIRECTORY(developer-guide)
IF (EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/developer-guide/CMakeLists.txt)
ADD_SUBDIRECTORY(developer-guide)
ENDIF ()
IF (EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/tutorial/CMakeLists.txt)
ADD_SUBDIRECTORY(tutorial)
......
Subproject commit e128cfaffbbd37c734a667cdc2a07b6f06291615
SET(DEVGUIDESRC ${CMAKE_CURRENT_SOURCE_DIR})
SET(DEVGUIDE ${CMAKE_BINARY_DIR}/docs/developer-guide)
FILE(MAKE_DIRECTORY ${DEVGUIDE}/html)
FIND_PROGRAM(HTLATEX htlatex)
ADD_CUSTOM_TARGET(developer-guide-html
export TEXINPUTS=${CMAKE_SOURCE_DIR}//:${DEVGUIDESRC}//: &&
${HTLATEX} ${DEVGUIDESRC}/developer-guide.tex
"${DEVGUIDESRC}/styling.cfg,html,3,next,NoFonts"
WORKING_DIRECTORY ${DEVGUIDE}/html
)
# If tex4ht successful, create img dir and copy images across
FILE(GLOB_RECURSE imgfiles "img/*.png" "img/*.jpg" "*/img/*.png" "*/img/*.jpg")
ADD_CUSTOM_COMMAND(TARGET developer-guide-html
POST_BUILD COMMAND ${CMAKE_COMMAND} -E make_directory ${DEVGUIDE}/html/img)
FOREACH(img ${imgfiles})
ADD_CUSTOM_COMMAND(TARGET developer-guide-html
POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy ${img} ${DEVGUIDE}/html/img)
ENDFOREACH()
FILE(GLOB_RECURSE pdffiles "*/img/*.pdf")
FIND_PROGRAM(CONVERT convert)
FOREACH(pdf ${pdffiles})
GET_FILENAME_COMPONENT(BASENAME ${pdf} NAME_WE)
ADD_CUSTOM_COMMAND(TARGET developer-guide-html
POST_BUILD COMMAND
${CONVERT} ${pdf} ${DEVGUIDE}/html/img/${BASENAME}.png)
ENDFOREACH()
FIND_PROGRAM(PDFLATEX pdflatex)
FIND_PROGRAM(BIBTEX bibtex)
FIND_PROGRAM(MAKEINDEX makeindex)
ADD_CUSTOM_TARGET(developer-guide-pdf
export TEXINPUTS=${CMAKE_SOURCE_DIR}//: &&
${PDFLATEX} --output-directory ${DEVGUIDE} ${DEVGUIDESRC}/developer-guide.tex
COMMAND TEXMFOUTPUT=${DEVGUIDE} ${BIBTEX} ${DEVGUIDE}/developer-guide.aux
COMMAND TEXMFOUTPUT=${DEVGUIDE} ${MAKEINDEX} ${DEVGUIDE}/developer-guide.idx
COMMAND export TEXINPUTS=${CMAKE_SOURCE_DIR}//: &&
${PDFLATEX} --output-directory ${DEVGUIDE} ${DEVGUIDESRC}/developer-guide.tex
WORKING_DIRECTORY ${DEVGUIDESRC}
)
\chapter{Coding Standard}
The purpose of this page is to detail the coding standards of the project which
all contributors are requested to follow.
This page describes the coding style standard for C++. A coding style standard
defines the visual layout of source code. Presenting source code in a uniform
fashion facilitates the use of code by different developers. In addition,
following a standard prevents certain types of coding errors.
All of the items below, unless otherwise noted, are guidelines. They are
recommendations about how to lay out a given block of code. Use common sense and
provide comments to describe any deviation from the standard. Sometimes,
violating a guideline may actually improve readability.
If you are working with code that does not follow the standard, bring the code
up-to-date or follow the existing style. Don’t mix styles.
\section{Code Layout}
The aim here is to maximise readability on all platforms and editors.
\begin{itemize}
\item Code width of 80 characters maximum - hard-wrap longer lines.
\item Use sensible wrapping for long statements in a way which maximises
readability.
\item Do not put multiple statements on the same line.
\item Do not declare multiple variables on the same line.
\item Provide a default value on all variable declarations.
\item Enclose every program block (if, else, for, while, etc) in braces, even if
empty or just a single line.
\item Opening braces (\{) should be on their own line.
\item Braces at same indentation as preceding statement.
\item One class per .cpp and .h file only, unless nested.
\item Define member functions in the .cpp file in the same order as defined in
the .h file.
\item Templated classes defined and implemented in a single .hpp file.
\item Do not put inline functions in the header file unless the function is
trivial (e.g. accessor, empty destructor), or profiling explicitly suggests to.
\item Inline functions should be declared within the class declaration but
defined outside the class declaration at the bottom of the header file.
\begin{notebox}
Virtual and inline are mutually exclusive. Virtual functions should therefore be
implemented in the .cpp file.
\end{notebox}
\end{itemize}
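For illustration only (the function and variable names below are hypothetical),
a short function laid out according to these guidelines might look as follows:
\begin{lstlisting}[style=C++Style]
void ComputeSum(const double *pValues, int pNumValues, double &pResult)
{
    // One variable per declaration, each with a default value.
    double sum = 0.0;

    for (int i = 0; i < pNumValues; ++i)
    {
        sum += pValues[i];
    }

    // A single return point; the result is passed back by reference.
    pResult = sum;
}
\end{lstlisting}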
\section{Space}
Adding an appropriate amount of white space enhances readability. Too much white
space, on the other hand, detracts from that readability.
\begin{itemize}
\item Indent using a four-space tab. Consistent tab spacing is necessary to
maintain formatting. Note that this means when a tab is pressed, four physical spaces are
inserted into the source instead.
\item Put a blank line at the end of a public/protected/private block.
\item Put a blank line at the end of every file.
\item Put a space after every keyword (if, while, for, etc.).
\item Put a space after every comma, unless the comma is at the end of the line.
\item Do not put a space before the opening parenthesis of an argument list to a
function.
\item Declare pointers and references with the * or \& symbol next to the
declarator, not the type; e.g., Object *object. Do not put multiple variables in the same
declaration.
\item Place a space on both sides of a binary operator.
\item Do not use a space to separate a unary operator from its operand.
\item Place open and close braces on their own line. No executable statements
should appear on the line with the brace, but comments are allowed. Indent opening
braces at the same level as the statement above and indent the closing brace at
the same level as the corresponding opening brace.
\item Indent all statements following an open brace by one tab. Developer Studio
puts any specifier terminated with a colon at the same indentation level as the
enclosing brace. Examples of such specifiers include case statements, access
specifiers (public, private, protected), and goto labels. This is not acceptable
and should be manually corrected so that all statements appearing within a block
and delineated by braces are indented.
\item Break a line into multiple lines when it becomes too long to read. Use at
least two tabs to start the new line, so it does not look like the start of a
block.
\item Follow C++ style comments with one space. It is also preferable to
consider any text that follows C++ style comments as a sentence and to begin this text
with a capital letter. This helps to distinguish the line from a continuation of
a previous line; i.e., \inlsh{// This is my comment.}
\item As a general rule, don’t keep commented out source code in the final
baselined product. Such code leads the reader to believe there was uncertainty
in the code as it currently exists.
\item Place the \# of a preprocessor directive at column one. An exception is
the use of nested ifdefs where the bodies only contain other preprocessor directives.
Add tabs to enhance readability:
\begin{lstlisting}[style=C++Style]
void foo() {
    for(int i = 0; i < 10; ++i)
    {
#ifdef BAR
        do_something();
#endif
        for_loop_code();
    }
}
\end{lstlisting}
\item Use tabular white space if it enhances
readability.
\item Use only one return statement. Structure the code so that only one return
statement is necessary.
\end{itemize}
\section{Naming Conventions}
Keep variable and function names meaningful but concise.
\begin{itemize}
\item Begin variable names with a lower-case letter.
\item Begin function names and class names with an upper-case letter.
\item All function, class and variable names should be written in CamelCase,
e.g. \inlsh{MyClass, DoFunction() or myVariableName}.
\item All preprocessor definitions written in UPPER\_CASE with words separated
by underscores, e.g. USE\_SPECIFIC\_FEATURE.
\item All member variables prefixed with m\_.
\item All constants prefixed with a k.
\item All function parameters prefixed with a p.
\item All enumerations prefixed with an e.
\item Do not use leading underscores.
\end{itemize}
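The following fragment is purely illustrative (none of the names refer to
existing Nektar++ code) and shows these conventions in use:
\begin{lstlisting}[style=C++Style]
// Preprocessor definition in upper case with underscores.
#define USE_SPECIFIC_FEATURE

// Enumeration values prefixed with an e.
enum GeomState
{
    eRegular,
    eDeformed
};

// Constants prefixed with a k.
const int kMaxIterations = 100;

class MyClass
{
public:
    // Function parameters prefixed with a p.
    void DoFunction(int pNumSteps);

private:
    // Member variables prefixed with m_.
    int m_numSteps;
};
\end{lstlisting}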
\section{Namespaces}
The top-level namespace is "Nektar". All code should reside in this namespace or
a sub-space of this.
\begin{itemize}
\item Namespaces correspond to code structure.
\item Namespaces should be kept to a minimum to simplify the interface to their
contents.
\end{itemize}
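For example, code in the LibUtilities sub-library lives in a correspondingly
named sub-namespace (the class name below is illustrative):
\begin{lstlisting}[style=C++Style]
namespace Nektar
{
namespace LibUtilities
{
    // Classes belonging to the LibUtilities sub-library are declared
    // in the matching sub-namespace of Nektar.
    class MyUtilityClass;
}
}
\end{lstlisting}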
\section{Documentation}
\begin{itemize}
\item Briefs for classes, functions and types in header files using
\inlsh{///} notation.
\item Full documentation with implementation using \inlsh{/** ... *\/}
notation.
\item Use @ symbol for @class, @param, @returns, etc for ease of identification.
\item Any separate documentation pages not directly associated with a portion of
the code should be in a separate file in /docs/html/doxygen.
\end{itemize}
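As a brief sketch of these conventions applied to a hypothetical class:
\begin{lstlisting}[style=C++Style]
/// Brief description of the class, placed in the header file.
class MyClass
{
public:
    /// Computes a result from the two supplied values.
    double ComputeResult(double pFirst, double pSecond);
};

// In the corresponding .cpp file:
/**
 * @brief Computes a result from the two supplied values.
 *
 * @param   pFirst   First input value.
 * @param   pSecond  Second input value.
 * @returns The computed result.
 */
double MyClass::ComputeResult(double pFirst, double pSecond)
{
    return pFirst + pSecond;
}
\end{lstlisting}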
\chapter{Core Concepts}
This section describes some of the key concepts which are useful when developing
code within the Nektar++ framework.
\section{Factory method pattern}
The factory method pattern is used extensively throughout Nektar++ as a
mechanism to instantiate objects. It provides the following benefits:
\begin{itemize}
\item Encourages modularisation of code such that conceptually related
algorithms are grouped together
\item Structuring of code such that different implementations of the same
concept are encapsulated and share a common interface
\item Users of factory-instantiated modules need only be concerned with the
interface and not the details of underlying implementations
\item Simplifies debugging since code relating to a specific implementation
resides in a single class
\item The code is naturally decoupled to reduce header-file dependencies and
improve compile times
\item Enables implementations (e.g. relating to third-party libraries) to be
disabled through the build process (CMake) by not compiling a specific
implementation, rather than scattering preprocessing statements throughout the
code
\end{itemize}
For conceptual details see the Wikipedia page on the factory method pattern.
%\url{http://en.wikipedia.org/wiki/Factory_pattern}.
\subsection{Using NekFactory}
The templated NekFactory class implements the factory pattern in Nektar++.
There are two distinct aspects to creating a factory-instantiated collection of
classes: defining the public interface, and registering specific
implementations. Both of these involve adding standard boilerplate code. It is
assumed that we are writing code which implements a particular concept or piece
of functionality for which there are multiple implementations. The
reasons for multiple implementations may be very low level such as alternative
algorithms for solving a linear system, or high level, such as selecting from a
range of PDEs to solve.
\subsubsection{Creating an interface (base class)}
A base class must be defined which prescribes an implementation-independent
interface. In Nektar++, the template method pattern is used, requiring public
interface functions to be defined which call private virtual implementation
methods. The latter will be overridden in the specific implementation classes.
In the base class these virtual methods should be defined as pure virtual, since
there is no implementation and we will not be instantiating this base class
explicitly.
As an example we will create a factory for instantiating different
implementations of some concept \inlsh{MyConcept}, defined in
\inlsh{MyConcept.h} and \inlsh{MyConcept.cpp}. First in \inlsh{MyConcept.h},
we need to include the NekFactory header
\begin{lstlisting}[style=C++Style]
#include <LibUtilities/BasicUtils/NekFactory.hpp>
\end{lstlisting}
The following code should then be included just before the base class
declaration (in the same namespace as the class):
\begin{lstlisting}[style=C++Style]
class MyConcept;

// Datatype for the MyConcept factory
typedef LibUtilities::NekFactory< std::string, MyConcept,
                                  ParamType1,
                                  ParamType2 > MyConceptFactory;
MyConceptFactory& GetMyConceptFactory();
\end{lstlisting}
The template parameters define the datatype of the key used to retrieve a
particular implementation (usually a string, enum or custom class such as
\inlsh{MyConceptKey}), the base class (in our case \inlsh{MyConcept}) and a list
of zero or more parameters which are taken by the constructors of all
implementations of the type \inlsh{MyConcept} (in our case we have two). Note
that all implementations must take the same parameter list in their constructors.
The normal definition of our base class then follows:
\begin{lstlisting}[style=C++Style]
class MyConcept
{
public:
    MyConcept(ParamType1 p1, ParamType2 p2);
    ...
};
\end{lstlisting}
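As a sketch of the template method pattern described above (the method names
here are purely illustrative and not part of any existing interface), a public
interface function simply forwards to a private pure virtual method which each
implementation overrides:
\begin{lstlisting}[style=C++Style]
class MyConcept
{
public:
    MyConcept(ParamType1 p1, ParamType2 p2);

    // Public interface function used by callers.
    void Evaluate()
    {
        v_Evaluate();
    }

private:
    // Pure virtual implementation method, overridden by each specific
    // implementation (e.g. MyConceptImpl1).
    virtual void v_Evaluate() = 0;
};
\end{lstlisting}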
We must also define a shared pointer for our base class for use later
\begin{lstlisting}[style=C++Style]
typedef boost::shared_ptr<MyConcept> MyConceptShPtr;
\end{lstlisting}
\subsubsection{Creating a specific implementation (derived class)}
A class is defined for each specific implementation of a concept. It is these
specific implementations which are instantiated by the factory.
In our example we will have an implementation called \inlsh{MyConceptImpl1}
defined in \inlsh{MyConceptImpl1.h} and \inlsh{MyConceptImpl1.cpp}. In the
header file we include the base class header file
\begin{lstlisting}[style=C++Style]
#include <Subdir/MyConcept.h>
\end{lstlisting}
We then define the derived class as normal:
\begin{lstlisting}[style=C++Style]
class MyConceptImpl1 : public MyConcept
{
...
};
\end{lstlisting}
In order for the factory to work, it must know
\begin{itemize}
\item that \inlsh{MyConceptImpl1} exists, and
\item how to create it.
\end{itemize}
To allow the factory to create instances of our class we define a function in
our class:
\begin{lstlisting}[style=C++Style]
/// Creates an instance of this class
static MyConceptShPtr create(
    ParamType1 p1,
    ParamType2 p2)
{
    return MemoryManager<MyConceptImpl1>::AllocateSharedPtr(p1, p2);
}
\end{lstlisting}
This function simply creates an instance of \inlsh{MyConceptImpl1} using the
supplied parameters. It must be \inlsh{static} because we are not operating on
an existing instance and it should return a base class shared pointer (rather
than a \inlsh{MyConceptImpl1} shared pointer), since the point of the factory
is that the calling code does not know about specific implementations.
The last task is to register our implementation with the factory. This is done
using the \inlsh{RegisterCreatorFunction} member function of the factory.
However, we wish this to happen as early on as possible (so we can use the
factory straight away) and without needing to explicitly call the function for
every implementation at the beginning of our program (since this would again
defeat the point of a factory)! The solution is to use the function to
initialise a static variable: it will be executed prior to the start of the
\inlsh{main()} routine, and can be located within the very class it is
registering, satisfying our code decoupling requirements.
In \inlsh{MyConceptImpl1.h} we define a static variable with the same datatype
as the key used in our factory (in our case \inlsh{std::string})
\begin{lstlisting}[style=C++Style]
static std::string className;
\end{lstlisting}
The above variable can be \inlsh{private} since it is typically never actually
used within the code. We then initialise it in \inlsh{MyConceptImpl1.cpp}
\begin{lstlisting}[style=C++Style]
std::string MyConceptImpl1::className
    = GetMyConceptFactory().RegisterCreatorFunction(
        "Impl1",
        MyConceptImpl1::create,
        "First implementation of my concept.");
\end{lstlisting}
The first parameter specifies the value of the key which should be used to
select this implementation. The second parameter is a function pointer to our
static function used to instantiate our class. The third parameter provides a
description which can be printed when listing the available MyConcept
implementations.
\subsection{Instantiating classes}
To create instances of MyConcept implementations elsewhere in the code, we must
first include the ``base class'' header file
\begin{lstlisting}[style=C++Style]
#include <Subdir/MyConcept.h>
\end{lstlisting}
Note we do not include the header files for the specific MyConcept
implementations anywhere in the code (apart from \inlsh{MyConceptImpl1.cpp}).
If we modify the implementation, only the implementation itself requires
recompiling and the executable relinking.
We create an instance by retrieving the \inlsh{MyConceptFactory} and calling the
\inlsh{CreateInstance} member function of the factory:
\begin{lstlisting}[style=C++Style]
ParamType p1 = ...;
ParamType p2 = ...;
MyConceptShPtr p = GetMyConceptFactory().CreateInstance( "Impl1", p1, p2 );
\end{lstlisting}
Note that the class is used through the pointer \inlsh{p}, which is of type
\inlsh{MyConceptShPtr}, allowing any of the public interface functions in the
base class (and therefore the specific implementations behind them) to be
called, but not functions declared solely in a specific implementation.
\section{NekArray}
An Array is a thin wrapper around native arrays. Arrays provide all the
functionality of native arrays, with the additional benefits of automatic use of
the Nektar++ memory pool, automatic memory allocation and deallocation, bounds
checking in debug mode, and easier to use multi-dimensional arrays.
Arrays are templated to allow compile-time customization of their
dimensionality and data type.
Parameters:
\begin{itemize}
\item \inltt{Dim} Must be a type with a static unsigned integer called
\inltt{Value} that specifies the array's dimensionality. For example
\begin{lstlisting}[style=C++Style]
struct TenD
{
    static const unsigned int Value = 10;
};
\end{lstlisting}
\item \inltt{DataType} The type of data to store in the array.
\end{itemize}
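For illustration, a one-dimensional array of doubles could be declared and used
as follows (a minimal sketch, assuming the constructor taking a size and an
initial value):
\begin{lstlisting}[style=C++Style]
// Allocate a 1D array of 10 doubles, initialised to zero, using the
// Nektar++ memory pool behind the scenes.
Array<OneD, double> values(10, 0.0);

// Elements are accessed with the usual subscript syntax.
for (int i = 0; i < 10; ++i)
{
    values[i] = 2.0 * i;
}
\end{lstlisting}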
It is often useful to create a class member Array that is shared with users of
the object without letting the users modify the array. To allow this behavior,
Array<Dim, DataType> inherits from Array<Dim, const DataType>. The following
example shows what is possible using this approach:
\begin{lstlisting}[style=C++Style]
class Sample {
public:
    const Array<OneD, const double>& getData() const { return m_data; }
    void getData(Array<OneD, const double>& out) const { out = m_data; }
private:
    Array<OneD, double> m_data;
};
\end{lstlisting}
In this example, each instance of Sample contains an array. The getData
method gives the user access to the array values, but does not allow
modification of those values.
\subsection{Efficiency Considerations}
Tracking memory so it is deallocated only when no more Arrays reference it does
introduce overhead when copying and assigning Arrays. In most cases this loss of
efficiency is not noticeable. There are some cases, however, where it can cause
a significant performance penalty (such as in tight inner loops). If needed,
Arrays allow access to the C-style array through the \texttt{Array::data} member
function.
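As a sketch (again assuming a constructor taking a size and an initial value,
and with \texttt{data()} returning a pointer to the underlying storage):
\begin{lstlisting}[style=C++Style]
Array<OneD, double> values(1000, 1.0);

// Fetch the underlying C-style array once, outside the tight loop, to
// avoid the Array overhead on every access.
double *raw = values.data();
for (int i = 0; i < 1000; ++i)
{
    raw[i] *= 2.0;
}
\end{lstlisting}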
\section{Threading}
\begin{notebox}
Threading is not currently included in the main code distribution. However, this
hybrid MPI/pthread functionality should be available within the next few months.
\end{notebox}
We investigated adding threaded parallelism to the already MPI parallel
Nektar++. MPI parallelism has multiple processes that exchange data using
network or network-like communications. Each process retains its own memory
space and cannot affect any other process’s memory space except through the MPI
API. A thread, on the other hand, is a separately scheduled set of instructions
that still resides within a single process’s memory space. Therefore threads
can communicate with one another simply by directly altering the process’s
memory space. The project's goal was to attempt to utilise this difference to
speed up communications in parallel code.
A design decision was made to add threading in an implementation independent
fashion. This was achieved by using the standard factory methods which
instantiate an abstract thread manager, which is then implemented by a concrete
class. For the reference implementation it was decided to use the Boost library
rather than native p-threads because Nektar++ already depends on the Boost
libraries, and Boost implements threading in terms of p-threads anyway.
It was decided that the best approach would be to use a thread pool. This
resulted in the abstract classes ThreadManager and ThreadJob. ThreadManager is
a singleton class and provides an interface for the Nektar++ programmer to
start, control, and interact with threads. ThreadJob has only one method, the
virtual method run(). Subclasses of ThreadJob must override run() and provide a
suitable constructor. Instances of these subclasses are then handed to the
ThreadManager which dispatches them to the running threads. Many thousands of
ThreadJobs may be queued up with the ThreadManager and strategies may be
selected by which the running threads take jobs from the queue. Synchronisation
methods are also provided within the ThreadManager such as wait(), which waits
for the thread queue to become empty, and hold(), which pauses a thread that
calls it until all the threads have called hold(). The API was thoroughly
documented in Nektar++’s existing Javadoc style.
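Based on the description above, a ThreadJob subclass might look roughly as
follows. This is a hypothetical sketch only: the Matrix and Vector types, the
constructor arguments and the way jobs are queued are illustrative and do not
correspond to a published API.
\begin{lstlisting}[style=C++Style]
class MatVecJob : public ThreadJob
{
public:
    // Store references to the data this job will work on.
    MatVecJob(const Matrix &pMat, const Vector &pVec, Vector &pResult)
        : m_mat(pMat), m_vec(pVec), m_result(pResult)
    {
    }

    // Overridden from ThreadJob; executed by one of the pool threads.
    virtual void run()
    {
        m_result = m_mat * m_vec;
    }

private:
    const Matrix &m_mat;
    const Vector &m_vec;
    Vector       &m_result;
};
\end{lstlisting}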
Classes were then written for a concrete implementation of ThreadManager using
the Boost library. Boost has the advantage of being available on all Nektar++’s
supported platforms. It would not be difficult, however, to implement
ThreadManager using some other functionality, such as native p-threads.
Two approaches to utilising these thread classes were then investigated. The
bottom-up approach identifies likely regions of the code for parallelisation,
usually loops around a simple and independent operation. The top-down approach
seeks to run as much of the code as is possible within a threaded environment.
The former approach was investigated first due to its ease of implementation.
The operation chosen was the multiplication of a very large sparse block
diagonal matrix with a vector, where the matrix is stored as its many smaller
sub matrices. The original algorithm iterated over the sub matrices multiplying
each by the vector and accumulating the result. The new parallel algorithm
sends ThreadJobs consisting of batches of sub matrices to the thread pool. The
worker threads pick up the ThreadJobs and iterate over the sub matrices in the
job accumulating the result in a thread specific result vector. This latter
detail helps to avoid the problem of cache ping-pong which is where multiple
threads try to write to the same memory location, repeatedly invalidating one
another's caches.
Clearly this approach will work best when the sub matrices are large and there
are many of them. However, even for test cases that would be considered large
it became clear that the code was still spending too much time in its scalar
regions.
This led to the investigation of the top-down approach. Here the intent is to
run as much of the code as possible in multiple threads. This is a much more
complicated approach as it requires that the overall problem can be partitioned
suitably, that a mechanism be available to exchange data between the threads,
and that any code using shared resources be thread safe. As Nektar++ already
has MPI parallelism the first two requirements (data partitioning and exchange)
are already largely met. However since MPI parallelism is implemented by having
multiple independent processes that do not share memory space, global data in
the Nektar++ code, such as class static members or singleton instances, are now
vulnerable to change by all the threads running in a process.
A new class, ThreadedComm, was added alongside Nektar++’s communication class, Comm.
This class encapsulates a Comm object and provides extra functionality without
altering the API of Comm (this is the Decorator pattern). To the rest of the
Nektar++ library this Comm object behaves the same whether it is a purely MPI
Comm object or a hybrid threading plus MPI object. The existing data
partitioning code can be used with very little modification and the parts of the
Nektar++ library that exchange data are unchanged. When a call is made to
exchange data with other workers ThreadedComm first has the master thread on
each process (i.e. the first thread) use the encapsulated Comm object (typically
an MPI object) to exchange the necessary data between the other processes, and
then exchanges data with the local threads using direct memory to memory copies.
As an example: take the situation where there are two processes A and B,
possibly running on different computers, each with two threads 1 and 2. A
typical data exchange in Nektar++ uses the Comm method AllToAll(...) in which
each worker sends data to each of the other workers. Thread A1 will send data
from itself and thread A2 via the embedded MPI Comm to thread B1, receiving in
turn data from threads B1 and B2. Each thread will then pick up the data it
needs from the master thread on its process using direct memory to memory
copies. Compared to the situation where there are four MPI processes the number
of communications that actually pass over the network is reduced. Even MPI
implementations that are clever enough to recognise when processes are on the
same host must make a system call to transfer data between processes.
The code was then audited for situations where threads would be attempting to
modify global data. Where possible such situations were refactored so that each
thread has a copy of the global data. Where the original design of Nektar++ did
not permit this, access to global data was mediated through locking and
synchronisation. This latter approach is not favoured except for global data
that is used infrequently because locking reduces concurrency.
The code has been tested on the Imperial College cluster cx1 and has shown good
scaling. However, it is not yet clear that the threading approach outperforms
the MPI approach; it is possible that the speedups gained through avoiding
network operations are lost due to locking and synchronisation issues. These
losses could be mitigated through more in-depth refactoring of Nektar++.
\ No newline at end of file
\section{Connectivity}
The typical elemental decomposition of the spectral/hp element method requires a
global assembly process when considering multi-elemental problems. This global
assembly will ensure some level of connectivity between adjacent elements such
that there is some form of continuity across element boundaries in the global
solution. In this section, we will merely focus on the classical Galerkin
method, where global continuity is typically imposed by making the approximation
$C^0$ continuous.
\subsection{Connectivity in two dimensions}
As explained in \cite{KaSh05}, the global assembly process involves the
transformation from local degrees of freedom to global degrees of freedom
(DOF). This transformation is typically done by a mapping array which relates
the numbering of the local (= elemental) DOF's to the numbering of the global
DOF's. To understand how this transformation is set up in Nektar++ one should
understand the following:
\begin{itemize}
\item \textbf{Starting point}
The starting point is the initial numbering of the elemental expansion modes.
This corresponds to the order in which the different local expansion modes
are listed in the coefficient array \texttt{m\_coeffs} of the elemental
(local or standard) expansion. The specific order in which the different elemental
expansion modes appear is motivated by compatibility with the
sum-factorisation technique. This also implies that this ordering is fixed
and should not be changed by the user. Hence, this unchangeable initial local
numbering will serve as starting input for the connectivity.
\item \textbf{End point}
Obviously, we are working towards the numbering of the global DOF's. This
global ordering should:
\begin{itemize}
\item reflect the chosen continuity approach (standard $C^0$ Galerkin in our case)
\item (optionally) have some optimal ordering (where optimality can
be defined in different ways, e.g. minimal bandwidth)
\end{itemize}
\end{itemize}
All intermediate steps from starting point to end point can basically be chosen
freely but they should allow for an efficient construction of the global
numbering system starting from the elemental ordering of the local degrees of
freedom. Currently, Nektar++ provides a number of tools and routines in the
different sublibraries which can be employed to set up the mapping from local to
global DOF's. These tools will be listed below, but first the connectivity
strategies for both modal and nodal expansions will be explained. Note that all
explanations below are focussed on quadrilateral elements. However, the general
idea equally holds for triangles.
\subsection{Connectivity strategies}
For a better understanding of the described strategies, one should first
understand how Nektar++ deals with the basic geometric concepts such as edges
and (2D) elements.
In Nektar++, a (2D) element is typically defined by a set of edges. In the input
.xml files, this should be done as (for a quadrilateral element):
\begin{lstlisting}[style=XMLStyle]
<Q ID="i"> e0 e1 e2 e3</Q>
\end{lstlisting}
where \inltt{e0} to \inltt{e3} correspond to the mesh ID's of the different
edges.
It is important to know that in Nektar++, the convention is that these edges
should be ordered counterclockwise (Note that this order also corresponds to the
order in which the edges are passed to the constructors of the 2D geometries).
In addition, note that we will refer to edge \inltt{e0} as the edge with local
(elemental) edge ID equal to 0, to edge \inltt{e1} as local edge with ID 1, to
edge \inltt{e2} as local edge with ID equal 2 and to edge \inltt{e3} as local
edge with ID 3.
Furthermore, one should note that the local coordinate system is orientated such
that the first coordinate axis is aligned with edges 0 and 2 (local edge ID), and
the second coordinate axis is aligned with edges 1 and 3. The direction of these
coordinate axes is such that they point in the counterclockwise direction for
edges 0 and 1, and in the clockwise direction for edges 2 and 3.
Another important feature in the connectivity strategy is the concept of edge
orientation. For a better understanding, consider the input format of an edge as
used in the input .xml files which contain the information about the mesh. An
edge is defined as:
\begin{lstlisting}[style=XMLStyle]
<E ID="i"> v0 v1 </E>
\end{lstlisting}
where \inltt{v0} and \inltt{v1} are the ID's of the two vertices that define
the edge (Note that these vertices are passed in the same order to the constructor
of the edge). Now, the orientation of an edge of a two-dimensional element (i.e.
quadrilateral or triangle) is defined as:
\begin{itemize}
\item Forward if the vertex with ID \inltt{v0} comes before the vertex with ID
\inltt{v1} when considering the vertices of the element in a
counterclockwise direction
\item Backward otherwise.
\end{itemize}
This has the following implications:
\begin{itemize}
\item The common edge of two adjacent elements always has a forward orientation
for one of the elements it belongs to and a backward orientation for the other.
\item The orientation of an edge is only relevant when considering
two-dimensional elements. It is not a property inherent to the edge itself, but
depends on the element it belongs to. (This also means that a segment does not
have an orientation)
\end{itemize}
\subsubsection{Modal expansions}
We will follow the basic principles of the connectivity strategy as explained in
Section 4.2.1.1 of \cite{KaSh05} (such as the hierarchic ordering of the edge
modes). However, we do not follow the strategy described to negate the odd modes
of an intersecting edge of two adjacent elements if the local coordinate systems
have an opposite direction. The explained strategy involves checking the
direction of the local coordinate systems of the neighbouring elements. However,
for a simpler automatic procedure to identify which edges need to have their odd modes
negated, we would like to have an approach which can be applied to the elements
individually, without information being coupled between neighbouring elements.
This can be accomplished in the following way. Note that this approach is based
on the earlier observation that the intersecting edge of two elements always has
opposite orientation. Proper connectivity can now be guaranteed if:
\begin{itemize}
\item forward oriented edges always have a counterclockwise local coordinate
axis
\item backward oriented edges always have a clockwise local coordinate axis.
\end{itemize}
Both local coordinate axes along an intersecting edge will then point in the
same direction. Obviously, these conditions will not be fulfilled by default.
But in order to do so, the direction of the local coordinate axis should be
reversed in the following situations:
\begin{lstlisting}[style=C++Style]
if ((LocalEdgeId == 0)||(LocalEdgeId == 1)) {
    if( EdgeOrientation == Backward ) {
        // change orientation of local coordinate axis
    }
}
if ((LocalEdgeId == 2)||(LocalEdgeId == 3)) {
    if( EdgeOrientation == Forward ) {
        // change orientation of local coordinate axis
    }
}
\end{lstlisting}
The algorithm above is based on the earlier observation that the local
coordinate axes automatically point in the counterclockwise direction for edges 0
and 1 and in the clockwise direction for the other edges. As explained in \cite{KaSh05},
the change of local coordinate axis can actually be done by reversing the sign
of the odd modes. This is implemented by means of an additional sign vector.
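As a minimal sketch of how such a sign vector might be filled for one edge
(using the notation of the mapping pseudo-code shown later in this chapter; the
flag \inlsh{reverseLocalAxis} is illustrative and which modes count as odd
depends on the ordering of the basis):
\begin{lstlisting}[style=C++Style]
if (reverseLocalAxis) // determined by the criteria listed above
{
    for(j = 0; j < Number_Of_InteriorEdgeModes_Of_Edge_i; j++)
    {
        // Only the odd interior edge modes change sign when the local
        // coordinate axis of the edge is reversed.
        if (j % 2 == 1)
        {
            sign[e][localNumbering(j)] = -1;
        }
    }
}
\end{lstlisting}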
\subsubsection{Nodal expansions}
For the nodal expansions, we will use the connectivity strategy as explained in
Section 4.2.1.1 of \cite{KaSh05}. However, we will clarify this strategy from a
Nektar++ point of view. As pointed out in \cite{KaSh05}, the nodal edge modes
can be identified with the physical location of the nodal points. In order to ensure
proper connectivity between elements the edge modes with the same nodal location
should be matched. This will be accomplished if both the sets of local edge
modes along the intersection edge of two elements are numbered in the same
direction. And as the intersecting edge of two elements always has opposite
direction, this can be guaranteed if:
\begin{itemize}
\item the local numbering of the edge modes is counterclockwise for forward
oriented edges
\item the local numbering of the edge modes is clockwise for backward oriented
edges.
\end{itemize}
This will ensure that the numbering of the global DOF's on an edge is in the
same direction as the two subsets of local DOF's on the intersecting edge.
\subsection{Implementation}
The main entity for the transformation from local DOF's to global DOF's (in 2D)
is the LocalToGlobalMap2D class. This class basically is the abstraction of the
mapping array map[e][i] as introduced in section 4.2.1 of \cite{KaSh05}. This
mapping array is contained in the class' main data member,
LocalToGlobalMap2D::m\_locToContMap. Let us recall what this mapping array
\begin{lstlisting}
map[e][i] = globalID
\end{lstlisting}
actually represents:
\begin{itemize}
\item e corresponds to the e-th element
\item i corresponds to the i-th expansion
mode within element e. This index i in the map array corresponds to the index of
the coefficient array m\_coeffs.
\item globalID represents the ID of the corresponding global degree of freedom.
\end{itemize}
However, rather than this two-dimensional structure of the mapping array,\\
\texttt{LocalToGlobalMap2D::m\_locToContMap} stores the mapping array as a
one-dimensional array which is the concatenation of the different elemental
mapping arrays map[e]. This mapping array can then be used to assemble the
global system out of the local entries, or to do any other transformation
between local and global degrees of freedom (Note that other mapping arrays such
as the boundary mapping bmap[e][i] or the sign vector sign[e][i] which might be
required are also contained in the LocalToGlobalMap2D class).
For a better appreciation of the implementation of the connectivity in Nektar++,
it might be useful to consider how this mapping array is actually being
constructed (or filled). To understand this, first consider the following:
\begin{itemize}
\item The initial local elemental numbering (referred to as starting point in
the beginning of this document) is not suitable to set up the mapping. In no way
does it correspond to the local numbering required for a proper connectivity as
elaborated in Section 4.2.1 of \cite{KaSh05}. Hence, this initial ordering
as such cannot be used to implement the connectivity strategies explained above. As a
result, additional routines (described below), which reorder the local
numbering, will be required in order to construct the
mapping array properly.
\item Although the different edge modes can be thought of as including both
vertex modes, we will make a clear distinction between them in the
implementation. In other words, the vertex modes will be treated separately
from the other modes on an edge, as they are conceptually different from a
connectivity point of view. We will refer to these remaining modes as interior
edge modes.
\end{itemize}
The fill-in of the mapping array can then be summarised by the following part of
(simplified) code:
\begin{lstlisting}[style=C++Style]
for(e = 0; e < Number_Of_2D_Elements; e++) {
    for(i = 0; i < Number_Of_Vertices_Of_Element_e; i++) {
        offsetValue = ...
        map[e][GetVertexMap(i)] = offsetValue;
    }
    for(i = 0; i < Number_Of_Edges_Of_Element_e; i++) {
        localNumbering = GetEdgeInteriorMap(i);
        offsetValue = ...
        for(j = 0; j < Number_Of_InteriorEdgeModes_Of_Edge_i; j++) {
            map[e][localNumbering(j)] = offsetValue + j;
        }
    }
}
\end{lstlisting}
In this document, we will not cover how to calculate the offsetValue, which:
\begin{itemize}
\item for the vertices, corresponds to the global ID of the specific vertex
\item for the edges, corresponds to the starting value of the global numbering on the
concerned edge
\end{itemize}
However, we would like to focus on the routines GetVertexMap() and
GetEdgeInteriorMap(), two functions which reorder the initial local
numbering so that it is compatible with the connectivity strategy.
\subsubsection{GetVertexMap()}
Given the local vertex id (i.e. 0,1,2 or 3 for a quadrilateral element), it
returns the position of the corresponding vertex mode in the elemental
coefficient array StdRegions::StdExpansion::m\_coeffs. By using this function as
in the code above, it is ensured that the global ID of the vertex is entered in
the correct position in the mapping array.
\subsubsection{GetEdgeInteriorMap()}
Like the previous routine, this function is also defined for all two dimensional
expansions in the StdRegions::StdExpansion class tree. This is actually the most
important function to ensure proper connectivity between neighbouring elements.
It is a function which reorders the numbering of local DOF's according to the
connectivity strategy. As input this function takes:
\begin{itemize}
\item the local edge ID of the edge to be considered
\item the orientation of this edge.
\end{itemize}
As output, it returns the local ordering of the requested (interior) edge modes.
This is contained
in an array of size N-2, where N is the number of expansion modes in the
relevant direction. The entries in this array represent the position of the
corresponding interior edge mode in the elemental coefficient array
StdRegions::StdExpansion::m\_coeffs.
Rather than the actual values of the local numbering, it is the ordering of
local edge modes which is of importance for the connectivity. That is why it is
important how the different interior edge modes are sorted in the returned
array. This should be such that for both the elements which are connected by the
intersecting edge, the local (interior) edge modes are iterated in the same
order. This will guarantee a correct global numbering scheme when employing the
algorithm shown above. This proper connectivity can be ensured if the function
GetEdgeInteriorMap:
\begin{itemize}
\item for modal expansions: returns the edge interior modes in hierarchical
order (i.e. the lowest polynomial order mode first),
\item for nodal expansions: returns the edge interior modes in:
\begin{itemize}
\item counterclockwise order for forward oriented edges
\item clockwise order for backward oriented edges.
\end{itemize}
\end{itemize}
\chapter{Data Structures and Algorithms}
\label{ch:data}
\input{connectivity}
\input{time-integration}
\input{preconditioners}
\ No newline at end of file
\section{Preconditioners}
\label{sec:precon}
Most of the solvers in \nekpp, including the incompressible Navier-Stokes
equations, rely on the solution of a Helmholtz equation,
%
\begin{equation}
\nabla^{2}u(\mathbf{x})+\lambda u(\mathbf{x})=f(\mathbf{x}),
\label{eq:precon:helm}
\end{equation}
%
an elliptic boundary value problem, at every time-step, where $u$ is defined on
a domain $\Omega$ of $N_{\mathrm{el}}$ non-overlapping elements. In this
section, we outline the preconditioners which are implemented in \nekpp. Whilst
some of the preconditioners are generic, many are especially designed for the
\emph{modified} basis only.
\subsection{Mathematical formulation}
The standard spectral/$hp$ approach to discretise \eqref{eq:precon:helm} starts
with an expansion in terms of the elemental modes:
%
\begin{equation}
u^{\delta}(\mathbf{x})=\sum_{n=0}^{N_{\mathrm{dof}}-1}\hat{u}_n
\Phi_n(\mathbf{x})=\sum_{e=1}^{{N_{\mathrm{el}}}}
\sum_{n=0}^{N^{e}_m-1}\hat{u}_n^e\phi_n^e(\mathbf{x})
\label{eq:precon:disc}
\end{equation}
%
where $N_{\mathrm{el}}$ is the number of elements, $N^{e}_m$ is the number of
local expansion modes within the element $\Omega^e$, $\phi_n^e(\mathbf{x})$ is
the $n^{\mathrm{th}}$ local expansion mode within the element $\Omega^e$,
$\hat{u}_n^e$ is the $n^{\mathrm{th}}$ local expansion coefficient within the
element $\Omega^e$. Approximating our solution by~\eqref{eq:precon:disc}, we
adopt a Galerkin discretisation of equation~\eqref{eq:precon:helm} where for an
appropriate test space $V^\delta$ we find an approximate solution
$\mathbf{u}^{\delta} \in V^{\delta}$ such that
%
\[
\mathcal L \left({v, u}\right) = \int_{\Omega}\nabla v^{\delta} \cdot \nabla
u^{\delta} + \lambda v^{\delta} u^{\delta} d \mathbf{x} = \int_{\Omega}
v^{\delta} f d\mathbf{x} \quad \forall v^{\delta} \in V^{\delta}
\]
%
This can be formulated in matrix terms as
%
\[
\mathbf{H}\hat{\mathbf{u}} = \mathbf{f}
\]
%
where $\mathbf{H}$ represents the Helmholtz matrix, $\hat{\mathbf{u}}$ are the
unknown global coefficients and $\mathbf{f}$ the inner product of the expansion
basis with the forcing function.
\subsubsection{$C^0$ formulation}
We first consider the $C^0$ (i.e. continuous Galerkin) formulation. The
spectral/$hp$ expansion basis is obtained by considering interior modes, which
have support in the interior of the element, separately from boundary modes
which are non-zero on the boundary of the element. We align the boundary modes
across the interface of the elements to obtain a continuous global solution. The
boundary modes can be further decomposed into vertex, edge and face modes,
defined as follows:
\begin{itemize}
\item vertex modes have support on a single vertex and the three adjacent
edges and faces as well as the interior of the element;
\item edge modes have support on a single edge and two adjacent faces as well
as the interior of the element;
\item face modes have support on a single face and the interior of the
element.
\end{itemize}
When the discretisation is continuous, this strong coupling between vertices,
edges and faces leads to a matrix of high condition number $\kappa$. Our aim is
to reduce this condition number by applying specialised
preconditioners. Utilising the above mentioned decomposition, we can write the
matrix equation as:
%
\[
\left[\begin{array}{cc}
\mathbf{H}_{bb} & \mathbf{H}_{bi}\\
\mathbf{H}_{ib} & \mathbf{H}_{ii}
\end{array}\right]
\left[ \begin{array}{c}
\hat{\mathbf{u}}_{b}\\
\hat{\mathbf{u}}_{i}\\
\end{array}\right] =
\left[ \begin{array}{c}
\hat{\mathbf{f}}_{b}\\
\hat{\mathbf{f}}_{i}\\
\end{array}\right]
\]
%
where the subscripts $b$ and $i$ denote the boundary and interior degrees of
freedom respectively. This system then can be statically condensed allowing us
to solve for the boundary and interior degrees of freedom in a decoupled
manner. The statically condensed matrix is given by
%
\[
\left[\begin{array}{cc}
\mathbf{H}_{bb}-\mathbf{H}_{bi} \mathbf{H}_{ii}^{-1} \mathbf{H}_{ib} & 0\\
\mathbf{H}_{ib} & \mathbf{H}_{ii}
\end{array}\right]
\left[ \begin{array}{c}
\hat{\mathbf{u}}_{b}\\
\hat{\mathbf{u}}_{i}\\
\end{array}\right] =
\left[ \begin{array}{c}
\hat{\mathbf{f}}_{b}-\mathbf{H}_{bi} \mathbf{H}_{ii}^{-1}\hat{\mathbf{f}}_{i}\\
\hat{\mathbf{f}}_{i}\\
\end{array}\right]
\]
%
This is highly advantageous since, by definition, our interior expansion
vanishes on the boundary, and so $\mathbf{H}_{ii}$ is block diagonal and thus
can be easily inverted. The above sub-structuring has reduced our problem to
solving the boundary problem:
%
\[
\mathbf{S}_{1}\hat{\mathbf{u}} = \hat{\mathbf{f}}_{1}
\]
%
where
$\mathbf{S_{1}}=\mathbf{H}_{bb}-\mathbf{H}_{bi} \mathbf{H}_{ii}^{-1}
\mathbf{H}_{ib}$
and
$\hat{\mathbf{f}}_{1}=\hat{\mathbf{f}}_{b}-\mathbf{H}_{bi}
\mathbf{H}_{ii}^{-1}\hat{\mathbf{f}}_{i}$.
Although this new system typically has better convergence properties (i.e. lower
$\kappa$), the system is still ill-conditioned, leading to a convergence rate of
the conjugate gradient (CG) routine that is prohibitively slow. For this reason
we need to precondition $\mathbf{S}_1$. To do this we solve an equivalent
system of the form:
%
\[
\mathbf{M}^{-1}\left(\mathbf{S}_{1} \hat{\mathbf{u}} - \hat{\mathbf{f}}_{1} \right) = 0
\]
%
where the preconditioning matrix $\mathbf{M}$ is such that
$\kappa\left(\mathbf{M}^{-1} \mathbf{S}_{1}\right)$ is less than
$\kappa\left(\mathbf{S}_{1}\right)$ and speeds up the convergence rate. Within
the conjugate gradient routine the same preconditioner $\mathbf{M}$ is applied
to the residual vector $\hat{\mathbf{r}}_{k+1}$ of the CG routine every
iteration:
%
\[
\hat{\mathbf{z}}_{k+1}=\mathbf{M}^{-1}\hat{\mathbf{r}}_{k+1}.
\]
%
\subsubsection{HDG formulation}
When utilising a hybridizable discontinuous Galerkin formulation, we perform a
static condensation approach but in a discontinuous framework, which for brevity
we omit here. However, we still obtain a matrix equation of the form
%
\[
\mathbf{\Lambda}\hat{\mathbf{u}} = \hat{\mathbf{f}}.
\]
%
where $\mathbf{\Lambda}$ represents an operator which projects the solution on
each face (in 3D) or edge (in 2D) back onto the element. In this setting,
$\hat{\mathbf{f}}$ consists of degrees of freedom for each edge (in 2D) or face
(in 3D). The overall system therefore has a weaker coupling between degrees of
freedom, but at the expense of a larger matrix system.
\subsection{Preconditioners}
Within the \nekpp framework a number of preconditioners are available to speed
up the convergence rate of the conjugate gradient routine. The table below
summarises each method, the dimensions of elements which are supported, and also
the discretisation type support which can either be continuous (CG) or
discontinuous (hybridizable DG).
\begin{center}
\begin{tabular}{lll}
\toprule
\textbf{Name} & \textbf{Dimensions} & \textbf{Discretisations} \\
\midrule
\inltt{Null} & All & All \\
\inltt{Diagonal} & All & All \\
\inltt{FullLinearSpace} & 2/3D & CG \\
\inltt{LowEnergyBlock} & 3D & CG \\
\inltt{Block} & 2/3D & All \\
\midrule
\inltt{FullLinearSpaceWithDiagonal} & All & CG \\
\inltt{FullLinearSpaceWithLowEnergyBlock} & 2/3D & CG \\
\inltt{FullLinearSpaceWithBlock} & 2/3D & CG \\
\bottomrule
\end{tabular}
\end{center}
The default is the \inltt{Diagonal} preconditioner. The above preconditioners
are specified through the \inltt{Preconditioner} option of the
\inltt{SOLVERINFO} section in the session file. For example, to enable
\inltt{FullLinearSpace} one can use:
\begin{lstlisting}[style=XMLStyle]
<I PROPERTY="Preconditioner" VALUE="FullLinearSpace" />
\end{lstlisting}
Alternatively one can have more control over different preconditioners for each
solution field by using the \inltt{GlobalSysSoln} section. For more details,
consult the user guide. The following sections specify the details for each
method.
\subsubsection{Diagonal}
Diagonal (or Jacobi) preconditioning is amongst the simplest preconditioning
strategies. In this scheme one takes the global matrix $\mathbf{H} = (h_{ij})$
and computes the diagonal terms $h_{ii}$. The preconditioner is then formed as a
diagonal matrix $\mathbf{M}^{-1} = (h_{ii}^{-1})$.
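In other words, applying $\mathbf{M}^{-1}$ to the residual within the conjugate
gradient iteration reduces to an element-wise scaling:
%
\[
\hat{z}_{i} = \frac{\hat{r}_{i}}{h_{ii}}.
\]
%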
\subsubsection{Linear space}
The linear space (or coarse space) of the matrix system is that containing
degrees of freedom corresponding only to the vertex modes in the high-order
system. Preconditioning of this space is achieved by forming the matrix
corresponding to the coarse space and inverting it, so that
%
\[
\mathbf{M}^{-1} = (\mathbf{S}^{-1}_{1})_{vv}
\]
%
Since the mesh associated with higher order methods is relatively coarse
compared with traditional finite element discretisations, the linear space can
usually be directly inverted without memory issues. However such a methodology
can be prohibitive on large parallel systems, due to a bottleneck in
communication.
In \nekpp the inversion of the linear space is handled using the
$XX^{T}$ library. $XX^{T}$ is a parallel direct solver for problems of the form
$\mathbf{A}\hat{\mathbf{x}} = \hat{\mathbf{b}}$ based around a sparse
factorisation of the inverse of $\mathbf{A}$. To precondition utilising this
methodology the linear sub-space is gathered from the expansion and the
preconditioned residual within the CG routine is determined by solving
%
\[
(\mathbf{S}_{1})_{vv}\hat{\mathbf{z}}=\hat{\mathbf{r}}
\]
%
The preconditioned residual $\hat{\mathbf{z}}$ is then scattered back to the
respective location in the global degrees of freedom.
\subsubsection{Block}
Block preconditioning of the $C^0$ continuous system is defined by the
following:
%
\[
\mathbf{M}^{-1}=\left[ \begin{array}{ccc}
(\mathbf{S}^{-1}_{1})_{vv} & 0 & 0 \\
0 & (\mathbf{S}^{-1}_{1})_{eb} & 0\\
0 & 0 & (\mathbf{S}^{-1}_{1})_{fb}\\
\end{array} \right]
\]
%
where $\mathrm{diag}[(\mathbf{S}_{1})_{vv}]$ is the diagonal of the vertex
modes, $(\mathbf{S}_{1})_{eb}$ and $(\mathbf{S}_{1})_{fb}$ are block diagonal
matrices corresponding to coupling of an edge (or face) with itself i.e ignoring
the coupling to other edges and faces. This preconditioner is best suited for
two dimensional problems.
In the HDG system, we take the block corresponding to each face and invert
it. Each of these inverse blocks then forms one of the diagonal components of
the block matrix $\mathbf{M}^{-1}$.
\subsection{Low energy}
Low energy basis preconditioning follows the methodology proposed by Sherwin \&
Casarin. In this method a new basis is numerically constructed from the original
basis which allows the Schur complement matrix to be preconditioned using a
block preconditioner. The method is outlined briefly in the following.
Elementally the local approximation $\mathbf{u}^{\delta}$ can be expressed as
different expansions lying in the same discrete space $V^{\delta}$
%
\[
\mathbf{u}^{\delta}(\mathbf{x})=\sum_{i}^{\dim(V^{\delta})}\hat{u}_{1i}\phi_{1i}(x)
= \sum_{i}^{\dim(V^{\delta})}\hat{u}_{2i}\phi_{2i}(x)
\]
%
Since both expansions lie in the same space it's possible to express one basis
in terms of the other via a transformation, i.e.
%
\[
\phi_{2}=\mathbf{C}\phi_{1} \implies \hat{\mathbf{u}}_{1}=\mathbf{C}^{T}\hat{\mathbf{u}}_{2}
\]
%
Applying this to the Helmholtz operator it is possible to show that,
%
\[
\mathbf{H}_{2}=\mathbf{C}\mathbf{H}_{1}\mathbf{C}^{T}
\]
%
For sub-structured matrices ($\mathbf{S}$) the transformation matrix
($\mathbf{C}$) becomes:
%
\[
\mathbf{C}=\left[ \begin{array}{cc}
\mathbf{R} & 0\\
0 & \mathbf{I}
\end{array} \right]
\]
%
Hence the transformation in terms of the Schur complement matrices is:
%
\[
\mathbf{S}_{2}=\mathbf{R}\mathbf{S}_{1}\mathbf{R}^{T}
\]
%
Typically the choice of expansion basis $\phi_{1}$ can lead to a Helmholtz
matrix that has undesirable properties, i.e. poor condition number. By choosing a
suitable transformation matrix $\mathbf{C}$ it is possible to construct a new
basis, numerically, that is amenable to block diagonal preconditioning.
%
\[
\mathbf{S}_{1}=\left[ \begin{array}{ccc}
\mathbf{S}_{vv} & \mathbf{S}_{ve} & \mathbf{S}_{vf}\\
\mathbf{S}^{T}_{ve}& \mathbf{S}_{ee} & \mathbf{S}_{ef} \\
\mathbf{S}^{T}_{vf} & \mathbf{S}^{T}_{ef} & \mathbf{S}_{ff} \end{array} \right] =\left[ \begin{array}{cc}
\mathbf{S}_{vv} & \mathbf{S}_{v,ef} \\
\mathbf{S}^{T}_{v,ef} & \mathbf{S}_{ef,ef} \end{array} \right]
\]
%
Applying the transformation
$\mathbf{S}_{2}=\mathbf{R} \mathbf{S}_{1} \mathbf{R}^{T}$ leads to the following
matrix
%
\[
\mathbf{S}_{2}=\left[ \begin{array}{cc}
\mathbf{S}_{vv}+\mathbf{R}_{v}\mathbf{S}^{T}_{v,ef}+\mathbf{S}_{v,ef}\mathbf{R}^{T}_{v}+\mathbf{R}_{v}\mathbf{S}_{ef,ef}\mathbf{R}^{T}_{v} & [\mathbf{S}_{v,ef}+\mathbf{R}_{v}\mathbf{S}_{ef,ef}]\mathbf{A}^{T} \\
\mathbf{A}[\mathbf{S}^{T}_{v,ef}+\mathbf{S}_{ef,ef}\mathbf{R}^{T}_{v}] & \mathbf{A}\mathbf{S}_{ef,ef}\mathbf{A}^{T} \end{array} \right]
\]
%
where $\mathbf{A}\mathbf{S}_{ef,ef}\mathbf{A}^{T}$ is given by
%
\[
\mathbf{A}\mathbf{S}_{ef,ef}\mathbf{A}^{T}=\left[ \begin{array}{cc}
\mathbf{S}_{ee}+\mathbf{R}_{ef}\mathbf{S}^{T}_{ef}+\mathbf{S}_{ef}\mathbf{R}^{T}_{ef}+\mathbf{R}_{ef}\mathbf{S}_{ff}\mathbf{R}^{T}_{ef} & \mathbf{S}_{ef}+\mathbf{R}_{ef}\mathbf{S}_{ff}\\
\mathbf{S}^{T}_{ef}+\mathbf{S}_{ff}\mathbf{R}^{T}_{ef} & \mathbf{S}_{ff}
\end{array} \right]
\]
%
To orthogonalise the edge and face modes, it can be seen from the
above that
%
\[
\mathbf{R}^{T}_{ef}=-\mathbf{S}^{-1}_{ff}\mathbf{S}^{T}_{ef}
\]
%
and to orthogonalise the vertex-edge and vertex-face modes:
%
\[
\mathbf{R}^{T}_{v}=-\mathbf{S}^{-1}_{ef,ef}\mathbf{S}^{T}_{v,ef}
\]
%
Here it is important to consider the form of the expansion basis since the
presence of $\mathbf{S}^{-1}_{ff}$ will lead to a new basis which has support on
all other faces; this is problematic when creating a $C^{0}$ continuous global
basis. To circumvent this problem when forming the new basis, the decoupling is
only performed between a specific edge and the two adjacent faces in a symmetric
standard region. Since the decoupling is performed in a rotationally symmetric
standard region the basis does not take into account the Jacobian mapping
between the local element and global coordinates, hence the final expansion will
not be completely orthogonal.
The low energy basis creates a Schur complement matrix that, although not
completely orthogonal, can be spectrally approximated by its block diagonal
contribution. The final form of the preconditioner is:
%
\[
\mathbf{M}^{-1}=\left[ \begin{array}{ccc}
\mathrm{diag}[(\mathbf{S}_{2})_{vv}] & 0 & 0 \\
0 & (\mathbf{S}_{2})_{eb} & 0\\
0 & 0 & (\mathbf{S}_{2})_{fb}\\
\end{array} \right]^{-1}
\]
%
where $\mathrm{diag}[(\mathbf{S}_{2})_{vv}]$ is the diagonal of the vertex
modes, $(\mathbf{S}_{2})_{eb}$ and $(\mathbf{S}_{2})_{fb}$ are block diagonal
matrices corresponding to coupling of an edge (or face) with itself i.e ignoring
the coupling to other edges and faces.
%%% DOCUMENTCLASS
%%%-------------------------------------------------------------------------------
\documentclass[
a4paper, % Stock and paper size.
11pt, % Type size.
% article,
% oneside,
onecolumn, % Only one column of text on a page.
% openright, % Each chapter will start on a recto page.
% openleft, % Each chapter will start on a verso page.
openany, % A chapter may start on either a recto or verso page.
]{memoir}
%%% PACKAGES
%%%------------------------------------------------------------------------------
\usepackage[utf8]{inputenc} % If utf8 encoding
% \usepackage[latin1]{inputenc} % If not utf8 encoding, then this is probably the way to go
\usepackage[T1]{fontenc} %
\usepackage{lmodern}
\usepackage[english]{babel} % English please
\usepackage[final]{microtype} % Less badboxes
% \usepackage{kpfonts} %Font
\usepackage{amsmath,amssymb,mathtools} % Math
\usepackage{pifont}% http://ctan.org/pkg/pifont
\newcommand{\cmark}{\ding{51}}%
\newcommand{\xmark}{\ding{55}}%
\usepackage{graphicx} % Include figures
\usepackage{makeidx}
\usepackage{import}
%%% PAGE LAYOUT
%%%-----------------------------------------------------------------------------
\setlrmarginsandblock{0.15\paperwidth}{*}{1} % Left and right margin
\setulmarginsandblock{0.2\paperwidth}{*}{1} % Upper and lower margin
\checkandfixthelayout
\newlength\forceindent
\setlength{\forceindent}{\parindent}
\setlength{\parindent}{0cm}
\renewcommand{\indent}{\hspace*{\forceindent}}
\setlength{\parskip}{1em}
%%% SECTIONAL DIVISIONS
%%%------------------------------------------------------------------------------
\maxsecnumdepth{paragraph} % Subsections (and higher) are numbered
\setsecnumdepth{paragraph}
\makeatletter %
\makechapterstyle{standard}{
\setlength{\beforechapskip}{0\baselineskip}
\setlength{\midchapskip}{1\baselineskip}
\setlength{\afterchapskip}{8\baselineskip}
\renewcommand{\chapterheadstart}{\vspace*{\beforechapskip}}
\renewcommand{\chapnamefont}{\centering\normalfont\Large}
\renewcommand{\printchaptername}{\chapnamefont \@chapapp}
\renewcommand{\chapternamenum}{\space}
\renewcommand{\chapnumfont}{\normalfont\Large}
\renewcommand{\printchapternum}{\chapnumfont \thechapter}
\renewcommand{\afterchapternum}{\par\nobreak\vskip \midchapskip}
\renewcommand{\printchapternonum}{\vspace*{\midchapskip}\vspace*{5mm}}
\renewcommand{\chaptitlefont}{\centering\bfseries\LARGE}
% \renewcommand{\printchaptertitle}[1]{\chaptitlefont ##1}
\renewcommand{\afterchaptertitle}{\par\nobreak\vskip \afterchapskip}
}
\makeatother
%\chapterstyle{standard}
\chapterstyle{madsen}
\setsecheadstyle{\normalfont\large\bfseries}
\setsubsecheadstyle{\normalfont\normalsize\bfseries}
\setparaheadstyle{\normalfont\normalsize\bfseries}
\setparaindent{0pt}\setafterparaskip{0pt}
%%% FLOATS AND CAPTIONS
%%%------------------------------------------------------------------------------
\makeatletter % You do not need to write [htpb] all the time
\renewcommand\fps@figure{htbp} %
\renewcommand\fps@table{htbp} %
\makeatother %
\captiondelim{\space } % A space between caption name and text
\captionnamefont{\small\bfseries} % Font of the caption name
\captiontitlefont{\small\normalfont} % Font of the caption text
\changecaptionwidth % Change the width of the caption
\captionwidth{1\textwidth} %
\usepackage{tabularx}
%%% ABSTRACT
%%%------------------------------------------------------------------------------
\renewcommand{\abstractnamefont}{\normalfont\small\bfseries} % Font of abstract title
\setlength{\absleftindent}{0.1\textwidth} % Width of abstract
\setlength{\absrightindent}{\absleftindent}
%%% HEADER AND FOOTER
%%%------------------------------------------------------------------------------
\makepagestyle{standard} % Make standard pagestyle
\makeatletter % Define standard pagestyle
\makeevenfoot{standard}{}{}{} %
\makeoddfoot{standard}{}{}{} %
\makeevenhead{standard}{\bfseries\thepage\normalfont\qquad\small\leftmark}{}{}
\makeoddhead{standard}{}{}{\small\rightmark\qquad\bfseries\thepage}
% \makeheadrule{standard}{\textwidth}{\normalrulethickness}
\makeatother %
\makeatletter
\makepsmarks{standard}{
\createmark{chapter}{both}{shownumber}{\@chapapp\ }{ \quad }
\createmark{section}{right}{shownumber}{}{ \quad }
\createplainmark{toc}{both}{\contentsname}
\createplainmark{lof}{both}{\listfigurename}
\createplainmark{lot}{both}{\listtablename}
\createplainmark{bib}{both}{\bibname}
\createplainmark{index}{both}{\indexname}
\createplainmark{glossary}{both}{\glossaryname}
}
\makeatother %
\makepagestyle{chap} % Make new chapter pagestyle
\makeatletter
\makeevenfoot{chap}{}{\small\bfseries\thepage}{} % Define new chapter pagestyle
\makeoddfoot{chap}{}{\small\bfseries\thepage}{} %
\makeevenhead{chap}{}{}{} %
\makeoddhead{chap}{}{}{} %
% \makeheadrule{chap}{\textwidth}{\normalrulethickness}
\makeatother
\nouppercaseheads
\pagestyle{standard} % Choosing pagestyle and chapter pagestyle
\aliaspagestyle{chapter}{chap} %
%%% NEW COMMANDS
%%%-----------------------------------------------------------------------------
% Nektar++ version
\usepackage{xspace}
\newcommand{\nekver} {\input{VERSION}\unskip}
\newcommand{\p}{\partial} %Partial
% Or what ever you want
%%% CODE SNIPPETS, COMMANDS, ETC
%%%-----------------------------------------------------------------------------
\usepackage{xcolor}
\usepackage{listings} % Display code / shell commands
\usepackage{lstautogobble}
%\newcommand{\shellcommand}[1]{\begin{lstlisting} #1 \end{lstlisting}}
\lstdefinestyle{BashInputStyle}{
language=bash,
basicstyle=\small\sffamily,
% numbers=left,
% numberstyle=\tiny,
% numbersep=3pt,
frame=,
columns=fullflexible,
backgroundcolor=\color{black!05},
linewidth=0.9\linewidth,
xleftmargin=0.1\linewidth
}
\definecolor{gray}{rgb}{0.4,0.4,0.4}
\definecolor{darkblue}{rgb}{0.0,0.0,0.6}
\definecolor{cyan}{rgb}{0.0,0.6,0.6}
\definecolor{maroon}{rgb}{0.5,0.0,0.0}
\lstdefinelanguage{XML}
{
basicstyle=\ttfamily\footnotesize,
morestring=[b]",
moredelim=[s][\bfseries\color{maroon}]{<}{\ },
moredelim=[s][\bfseries\color{maroon}]{</}{>},
moredelim=[l][\bfseries\color{maroon}]{/>},
moredelim=[l][\bfseries\color{maroon}]{>},
morecomment=[s]{<?}{?>},
morecomment=[s]{<!--}{-->},
commentstyle=\color{gray},
stringstyle=\color{orange},
identifierstyle=\color{darkblue}
}
\lstdefinestyle{XMLStyle}{
language=XML,
basicstyle=\sffamily\footnotesize,
numbers=left,
numberstyle=\tiny,
numbersep=3pt,
frame=,
columns=fullflexible,
backgroundcolor=\color{black!05},
linewidth=0.9\linewidth,
xleftmargin=0.1\linewidth
}
\lstdefinestyle{C++Style}{
language=C++,
basicstyle=\sffamily\footnotesize,
numbers=left,
numberstyle=\tiny,
numbersep=3pt,
frame=,
columns=fullflexible,
backgroundcolor=\color{black!05},
linewidth=0.9\linewidth,
xleftmargin=0.1\linewidth,
showspaces=false,
showstringspaces=false
}
\usepackage{tikz}
\ifdefined\HCode
\newcommand{\inltt}[1]{\texttt{#1}}
\newcommand{\inlsh}[1]{\texttt{#1}}
\else
\newcommand{\inltt}[1]{\tikz[anchor=base,baseline]\node[inner sep=3pt,
rounded corners,outer sep=0,draw=black!30,fill=black!05]{\small\texttt{#1}};}
\newcommand{\inlsh}[1]{\tikz[anchor=base,baseline]\node[inner sep=2pt,
outer sep=0,fill=black!05]{\texttt{#1}};}
\fi
\newcommand{\nekpp}{{\em Nektar++}\xspace}
% Highlight box
\usepackage{environ}
\usepackage[tikz]{bclogo}
\usetikzlibrary{calc}
% Only use fancy boxes for PDF
\ifdefined\HCode
\NewEnviron{notebox}{\textbf{Note:} \BODY}
\NewEnviron{warningbox}{\textbf{Warning:} \BODY}
\NewEnviron{tipbox}{\textbf{Tip:} \BODY}
\NewEnviron{custombox}[3]{\textbf{#1} \BODY}
\else
\NewEnviron{notebox}
{\par\medskip\noindent
\begin{tikzpicture}
\node[inner sep=5pt,fill=black!10,draw=black!30] (box)
{\parbox[t]{.99\linewidth}{%
\begin{minipage}{.1\linewidth}
\centering\tikz[scale=1]\node[scale=1.5]{\bcinfo};
\end{minipage}%
\begin{minipage}{.9\linewidth}
\textbf{Note}\par\smallskip
\BODY
\end{minipage}\hfill}%
};
\end{tikzpicture}\par\medskip%
}
\NewEnviron{warningbox}
{\par\medskip\noindent
\begin{tikzpicture}
\node[inner sep=5pt,fill=red!10,draw=black!30] (box)
{\parbox[t]{.99\linewidth}{%
\begin{minipage}{.1\linewidth}
\centering\tikz[scale=1]\node[scale=1.5]{\bcdanger};
\end{minipage}%
\begin{minipage}{.9\linewidth}
\textbf{Warning}\par\smallskip
\BODY
\end{minipage}\hfill}%
};
\end{tikzpicture}\par\medskip%
}
\NewEnviron{tipbox}
{\par\medskip\noindent
\begin{tikzpicture}
\node[inner sep=5pt,fill=green!10,draw=black!30] (box)
{\parbox[t]{.99\linewidth}{%
\begin{minipage}{.1\linewidth}
\centering\tikz[scale=1]\node[scale=1.5]{\bclampe};
\end{minipage}%
\begin{minipage}{.9\linewidth}
\textbf{Tip}\par\smallskip
\BODY
\end{minipage}\hfill}%
};
\end{tikzpicture}\par\medskip%
}
\NewEnviron{custombox}[3]
{\par\medskip\noindent
\begin{tikzpicture}
\node[inner sep=5pt,fill=#3!10,draw=black!30] (box)
{\parbox[t]{.99\linewidth}{%
\begin{minipage}{.1\linewidth}
\centering\tikz[scale=1]\node[scale=1.5]{#2};
\end{minipage}%
\begin{minipage}{.9\linewidth}
\textbf{#1}\par\smallskip
\BODY
\end{minipage}\hfill}%
};
\end{tikzpicture}\par\medskip%
}
\fi
%%% TABLE OF CONTENTS AND INDEX
%%%-----------------------------------------------------------------------------
\maxtocdepth{subsection} % Only parts, chapters and sections in the table of contents
\settocdepth{subsection}
\makeindex
%\AtEndDocument{\addtocontents{toc}{\par}} % Add a \par to the end of the TOC
%%% INTERNAL HYPERLINKS
%%%-----------------------------------------------------------------------------
\usepackage[linktoc=all,hyperfootnotes=false]{hyperref} % Internal hyperlinks
\hypersetup{
colorlinks,
citecolor=darkblue,
filecolor=darkblue,
linkcolor=darkblue,
urlcolor=darkblue,
pdfborder={0 0 0}, % No borders around internal hyperlinks
pdfauthor={I am the Author} % author
}
\usepackage{memhfixc} %
%%% PRETTY TITLE PAGE FOR PDF DOC
%%%-----------------------------------------------------------------------------
\makeatletter
\newlength\drop
\newcommand{\br}{\hfill\break}
\newcommand*{\titlepage}{%
\thispagestyle{empty}
\begingroup% Gentle Madness
\drop = 0.1\textheight
\vspace*{\baselineskip}
\vfill
\hbox{%
\hspace*{0.1\textwidth}%
\rule{1pt}{\dimexpr\textheight-28pt\relax}%
\hspace*{0.05\textwidth}%
\parbox[b]{0.85\textwidth}{%
\vbox{%
{\includegraphics[width=0.2\textwidth]{icon-blue.png}\par}
\vskip1.00\baselineskip
{\Huge\bfseries\raggedright\@title\par}
\vskip2.37\baselineskip
{\huge\bfseries Version \nekver\par}
\vskip4\baselineskip
{\huge\bfseries \textcolor{darkblue}{Developer Guide}\par}
\vskip1.0\baselineskip
{\large\bfseries\@date\par}
\vspace{0.3\textheight}
{\small\noindent\@author}\\[\baselineskip]
}% end of vbox
}% end of parbox
}% end of hbox
\vfill
\null
\endgroup}
\makeatother
%%% THE DOCUMENT
%%% Where all the important stuff is included!
%%%-------------------------------------------------------------------------------
\author{Department of Aeronautics, Imperial College London, UK\newline
Scientific Computing and Imaging Institute, University of Utah, USA}
\title{Nektar++: Spectral/hp Element Framework}
\date{\today}
\begin{document}
\frontmatter
% Render pretty title page if not building HTML
\ifdefined\HCode
\begin{center}
\includegraphics[width=0.1\textwidth]{img/icon-blue.png}
\end{center}
\maketitle
\begin{center}
\huge{Developer Guide - Version \nekver}
\end{center}
\else
\titlepage
\fi
\clearpage
\ifx\HCode\undefined
\tableofcontents*
\fi
\clearpage
\chapter{Introduction}
Nektar++ \cite{CaMoCoBoRo15} is a tensor-product based finite element package
designed to allow one to construct efficient classical low polynomial order
$h$-type solvers (where $h$ is the size of the finite element) as well as
higher-order piecewise polynomial $p$-type solvers. The framework currently has
the following capabilities:
\begin{itemize}
\item Representation of one, two and three-dimensional fields as a collection of
piecewise continuous or discontinuous polynomial domains.
\item Segment, plane and volume domains are permissible, as well as domains
representing curves and surfaces (dimensionally-embedded domains).
\item Hybrid shaped elements, i.e.\ triangles and quadrilaterals or tetrahedra,
prisms and hexahedra.
\item Both hierarchical and nodal expansion bases.
\item Continuous or discontinuous Galerkin operators.
\item Cross platform support for Linux, Mac OS X and Windows.
\end{itemize}
The framework comes with a number of solvers and also allows one to construct a
variety of new solvers.
Our current goals are to develop:
\begin{itemize}
\item Automatic tuning of optimal operator implementations based upon not
only $h$ and $p$ but also hardware considerations and mesh connectivity.
\item Temporal and spatial adaptivity.
\item Features enabling evaluation of high-order meshing techniques.
\end{itemize}
This document provides implementation details for the design of the libraries,
Nektar++-specific data structures and algorithms and other development
information.
\begin{warningbox}
This document is still under development and may be incomplete in parts.
\end{warningbox}
For further information and to download the software, visit the Nektar++ website
at \url{http://www.nektar.info}.
\mainmatter
\import{core-concepts/}{core-concepts.tex}
\import{library-design/}{library-design.tex}
\import{data-structures-algorithms/}{data-structures-algorithms.tex}
\import{coding-standard/}{coding-standard.tex}
%\appendix
\backmatter
%%% BIBLIOGRAPHY
%%% -------------------------------------------------------------
\bibliographystyle{plain}
\bibliography{../refs}
\printindex
\end{document}
docs/developer-guide/img/icon-blue.png (8.87 KiB)
docs/developer-guide/library-design/img/LocalRegions.png (21.4 KiB)
docs/developer-guide/library-design/img/MultiRegions.png (27.9 KiB)
docs/developer-guide/library-design/img/Quasi3d.png (38 KiB)
docs/developer-guide/library-design/img/SpatialDomains.png (23.9 KiB)
docs/developer-guide/library-design/img/StdRegions.png (27.7 KiB)