Commit 1e3400ce authored by Chris Cantwell

Merge branch 'master' into feature/collections

parents d35cb1e4 5b67aa99
......@@ -156,8 +156,9 @@ used within the code. We then initialise it in \inlsh{MyConceptImpl1.cpp}
\begin{lstlisting}[style=C++Style]
string MyConceptImpl1::className
= GetMyConceptFactory().RegisterCreatorFunction(
"Impl1", MyConceptImpl1::create, "First implementation of my
concept.");
"Impl1",
MyConceptImpl1::create,
"First implementation of my concept.");
\end{lstlisting}
The first parameter specifies the value of the key which should be used to
select this implementation. The second parameter is a function pointer to our
......@@ -239,3 +240,115 @@ a significant performance penalty (such as in tight inner loops). If needed,
Arrays allow access to the C-style array through the \texttt{Array::data} member
function.
\section{Threading}
\begin{notebox}
Threading is not currently included in the main code distribution. However, this
hybrid MPI/pthread functionality should be available within the next few months.
\end{notebox}
We investigated adding threaded parallelism to the already MPI parallel
Nektar++. MPI parallelism has multiple processes that exchange data using
network or network-like communications. Each process retains its own memory
space and cannot affect any other process’s memory space except through the MPI
API. A thread, on the other hand, is a separately scheduled set of instructions
that still resides within a single process’s memory space. Therefore threads
can communicate with one another simply by directly altering the process’s
memory space. The project's goal was to attempt to utilise this difference to
speed up communications in parallel code.
A design decision was made to add threading in an implementation independent
fashion. This was achieved by using the standard factory methods which
instantiate an abstract thread manager, which is then implemented by a concrete
class. For the reference implementation it was decided to use the Boost library
rather than native p-threads because Nektar++ already depends on the Boost
libraries, and Boost implements threading in terms of p-threads anyway.
It was decided that the best approach would be to use a thread pool. This
resulted in the abstract classes ThreadManager and ThreadJob. ThreadManager is
a singleton class and provides an interface for the Nektar++ programmer to
start, control, and interact with threads. ThreadJob has only one method, the
virtual method run(). Subclasses of ThreadJob must override run() and provide a
suitable constructor. Instances of these subclasses are then handed to the
ThreadManager which dispatches them to the running threads. Many thousands of
ThreadJobs may be queued up with the ThreadManager and strategies may be
selected by which the running threads take jobs from the queue. Synchronisation
methods are also provided within the ThreadManager such as wait(), which waits
for the thread queue to become empty, and hold(), which pauses a thread that
calls it until all the threads have called hold(). The API was thoroughly
documented in Nektar++’s existing Javadoc style.
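The thread pool arrangement described above can be illustrated with a minimal sketch. It assumes C++11 standard threads rather than Boost, and the names SimpleThreadPool, QueueJob, Wait and AddJob are hypothetical; it is not the ThreadManager interface itself, and it omits hold() and the scheduling strategies.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the abstract job class: a single virtual run().
class ThreadJob
{
public:
    virtual ~ThreadJob() = default;
    virtual void run() = 0;
};

// Hypothetical, much-simplified pool; not the actual ThreadManager interface.
class SimpleThreadPool
{
public:
    explicit SimpleThreadPool(std::size_t nThreads)
    {
        for (std::size_t i = 0; i < nThreads; ++i)
        {
            m_workers.emplace_back([this] { WorkerLoop(); });
        }
    }
    ~SimpleThreadPool()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stop = true;
        }
        m_jobReady.notify_all();
        for (std::thread &t : m_workers)
        {
            t.join();
        }
    }
    // Queue a job; an idle worker thread will pick it up.
    void QueueJob(std::unique_ptr<ThreadJob> job)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_jobs.push(std::move(job));
        }
        m_jobReady.notify_one();
    }
    // Block until the queue is empty and no job is still running.
    void Wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_idle.wait(lock, [this] { return m_jobs.empty() && m_active == 0; });
    }

private:
    void WorkerLoop()
    {
        for (;;)
        {
            std::unique_ptr<ThreadJob> job;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_jobReady.wait(lock,
                                [this] { return m_stop || !m_jobs.empty(); });
                if (m_jobs.empty())
                {
                    return; // stop requested and no work left
                }
                job = std::move(m_jobs.front());
                m_jobs.pop();
                ++m_active;
            }
            job->run(); // executed outside the lock
            {
                std::lock_guard<std::mutex> lock(m_mutex);
                --m_active;
            }
            m_idle.notify_all();
        }
    }

    std::vector<std::thread> m_workers;
    std::queue<std::unique_ptr<ThreadJob>> m_jobs;
    std::mutex m_mutex;
    std::condition_variable m_jobReady;
    std::condition_variable m_idle;
    std::size_t m_active = 0;
    bool m_stop = false;
};

// Example job: adds a value to a shared counter.
class AddJob : public ThreadJob
{
public:
    AddJob(std::atomic<long> &total, long value)
        : m_total(total), m_value(value) {}
    void run() override { m_total += m_value; }

private:
    std::atomic<long> &m_total;
    long m_value;
};
```

In this sketch a caller would construct the pool, pass AddJob instances to QueueJob(), and call Wait() before reading the counter. Keeping the job body outside the lock is what allows several jobs to run concurrently.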
Classes were then written for a concrete implementation of ThreadManager using
the Boost library. Boost has the advantage of being available on all Nektar++’s
supported platforms. It would not be difficult, however, to implement
ThreadManager using some other functionality, such as native p-threads.
Two approaches to utilising these thread classes were then investigated. The
bottom-up approach identifies likely regions of the code for parallelisation,
usually loops around a simple and independent operation. The top-down approach
seeks to run as much of the code as is possible within a threaded environment.
The former approach was investigated first due to its ease of implementation.
The operation chosen was the multiplication of a very large sparse block
diagonal matrix with a vector, where the matrix is stored as its many smaller
sub matrices. The original algorithm iterated over the sub matrices multiplying
each by the vector and accumulating the result. The new parallel algorithm
sends ThreadJobs consisting of batches of sub matrices to the thread pool. The
worker threads pick up the ThreadJobs and iterate over the sub matrices in the
job accumulating the result in a thread specific result vector. This latter
detail helps to avoid the problem of cache ping-pong which is where multiple
threads try to write to the same memory location, repeatedly invalidating one
another's caches.
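The batching scheme above can be sketched as follows. This is an illustration only: it assumes dense square blocks and plain C++11 threads in place of ThreadJobs and the thread pool, and the names Block and BlockDiagMultiply are hypothetical. Each thread accumulates into a private buffer and writes its finished batch to the shared result once.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// One dense square block of the block-diagonal matrix, stored row-major.
struct Block
{
    std::size_t n;         // block dimension
    std::vector<double> a; // n * n entries
};

// Computes y = A x for block-diagonal A.
// Assumes x.size() equals the sum of the block dimensions.
std::vector<double> BlockDiagMultiply(const std::vector<Block> &blocks,
                                      const std::vector<double> &x,
                                      std::size_t nThreads)
{
    // offset[b] is the first row (and column) of block b in the global system.
    std::vector<std::size_t> offset(blocks.size() + 1, 0);
    for (std::size_t b = 0; b < blocks.size(); ++b)
    {
        offset[b + 1] = offset[b] + blocks[b].n;
    }

    std::vector<double> y(offset.back(), 0.0);
    nThreads = std::max<std::size_t>(1, std::min(nThreads, blocks.size()));
    const std::size_t batch = (blocks.size() + nThreads - 1) / nThreads;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < nThreads; ++t)
    {
        const std::size_t begin = std::min(t * batch, blocks.size());
        const std::size_t end   = std::min(begin + batch, blocks.size());
        workers.emplace_back([&blocks, &x, &y, &offset, begin, end]
        {
            // Accumulate into a thread-private buffer...
            std::vector<double> local(offset[end] - offset[begin], 0.0);
            for (std::size_t b = begin; b < end; ++b)
            {
                const Block &blk = blocks[b];
                const double *xb = x.data() + offset[b];
                double *yb = local.data() + (offset[b] - offset[begin]);
                for (std::size_t i = 0; i < blk.n; ++i)
                {
                    for (std::size_t j = 0; j < blk.n; ++j)
                    {
                        yb[i] += blk.a[i * blk.n + j] * xb[j];
                    }
                }
            }
            // ...then write the finished batch to the shared result once.
            std::copy(local.begin(), local.end(), y.begin() + offset[begin]);
        });
    }
    for (std::thread &w : workers)
    {
        w.join();
    }
    return y;
}
```

Because the batches cover disjoint row ranges, no locking is needed for the final copy; the private buffer only limits how often neighbouring threads touch the same cache lines.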
Clearly this approach will work best when the sub matrices are large and there
are many of them. However, even for test cases that would be considered large
it became clear that the code was still spending too much time in its scalar
regions.
This led to the investigation of the top-down approach. Here the intent is to
run as much of the code as possible in multiple threads. This is a much more
complicated approach as it requires that the overall problem can be partitioned
suitably, that a mechanism be available to exchange data between the threads,
and that any code using shared resources be thread safe. As Nektar++ already
has MPI parallelism the first two requirements (data partitioning and exchange)
are already largely met. However since MPI parallelism is implemented by having
multiple independent processes that do not share memory space, global data in
the Nektar++ code, such as class static members or singleton instances, are now
vulnerable to change by all the threads running in a process.
A new class, ThreadedComm, was added alongside Nektar++’s communication class, Comm.
This class encapsulates a Comm object and provides extra functionality without
altering the API of Comm (this is the Decorator pattern). To the rest of the
Nektar++ library this Comm object behaves the same whether it is a purely MPI
Comm object or a hybrid threading plus MPI object. The existing data
partitioning code can be used with very little modification and the parts of the
Nektar++ library that exchange data are unchanged. When a call is made to
exchange data with other workers, ThreadedComm first has the master thread on
each process (i.e. the first thread) use the encapsulated Comm object (typically
an MPI object) to exchange the necessary data between the other processes, and
then exchanges data with the local threads using direct memory to memory copies.
As an example: take the situation where there are two processes A and B,
possibly running on different computers, each with two threads 1 and 2. A
typical data exchange in Nektar++ uses the Comm method AllToAll(...) in which
each worker sends data to each of the other workers. Thread A1 will send data
from itself and thread A2 via the embedded MPI Comm to thread B1, receiving in
turn data from threads B1 and B2. Each thread will then pick up the data it
needs from the master thread on its process using direct memory to memory
copies. Compared to the situation where there are four MPI processes the number
of communications that actually pass over the network is reduced. Even MPI
implementations that are clever enough to recognise when processes are on the
same host must make a system call to transfer data between processes.
The code was then audited for situations where threads would be attempting to
modify global data. Where possible such situations were refactored so that each
thread has a copy of the global data. Where the original design of Nektar++ did
not permit this, access to global data was mediated through locking and
synchronisation. This latter approach is not favoured except for global data
that is used infrequently because locking reduces concurrency.
The code has been tested on the Imperial College cluster cx1 and has shown good
scaling. However, it is not yet clear that the threading approach outperforms
the MPI approach; it is possible that the speedups gained through avoiding
network operations are lost due to locking and synchronisation issues. These
losses could be mitigated through more in-depth refactoring of Nektar++.
\ No newline at end of file
......@@ -126,7 +126,7 @@ Both the local coordinate axis along an intersecting edge will then point in the
same direction. Obviously, these conditions will not be fulfilled by default.
But in order to do so, the direction of the local coordinate axis should be
reversed in following situations:
\begin{lstlisting}
\begin{lstlisting}[style=C++Style]
if ((LocalEdgeId == 0)||(LocalEdgeId == 1)) {
if( EdgeOrientation == Backward ) {
change orientation of local coordinate axis
......@@ -184,7 +184,7 @@ mode within element e. This index i in this map array corresponds to the index o
\item globalID represents the ID of the corresponding global degree of freedom.
\end{itemize}
However, rather than this two-dimensional structure of the mapping array,
However, rather than this two-dimensional structure of the mapping array,\\
\texttt{LocalToGlobalMap2D::m\_locToContMap} stores the mapping array as a
one-dimensional array which is the concatenation of the different elemental
mapping arrays map[e]. This mapping array can then be used to assemble the
......@@ -215,7 +215,7 @@ the vertex mode, we will make a clear distinction between them in the
The fill-in of the mapping array can then be summarised by the following part of
(simplified) code:
\begin{lstlisting}
\begin{lstlisting}[style=C++Style]
for(e = 0; e < Number_Of_2D_Elements; e++) {
for(i = 0; i < Number_Of_Vertices_Of_Element_e; i++) {
offsetValue = ...
......
......@@ -215,8 +215,8 @@ switch between some (depending on the problem) of the following
time-integration schemes:
\begin{center}
\small
\begin{tabular}{ll}
\footnotesize
\begin{tabular}{p{4cm}p{10cm}}
\toprule
AdamsBashforthOrder1 & Adams-Bashforth Forward multi-step scheme of order 1\\
AdamsBashforthOrder2 & Adams-Bashforth Forward multi-step scheme of order 2\\
......
@misc{nektar-website,
title={Nektar++: Spectral/hp element framework},
url={http://www.nektar.info},
year={2014}
}
@book{KaSh05,
title={Spectral/hp Element Methods for Computational Fluid Dynamics},
author={Karniadakis, G. E. and Sherwin, S. J.},
publisher={Oxford Science Publications},
year={2005}
}
@article{Bu06,
title={General linear methods},
author={Butcher, J. C.},
journal={Acta Numerica},
volume={15},
pages={157-256}
}
@article{VoEsBoChKi11,
title={A generic framework for time-stepping partial differential equations (PDEs):
general linear methods, object-oriented implementation and application to fluid
problems},
author={Vos, P. E. J. and Eskilsson, C. and Bolis, A. and Chun, S. and Kirby, R. M.
and Sherwin, S. J.},
journal={International Journal of Computational Fluid Dynamics},
volume={25},
issue={3},
pages={107-125},
year={2011}
}
......@@ -31,6 +31,7 @@ openany, % A chapter may start on either a recto or verso page.
\usepackage{makeidx}
\usepackage{import}
%%% PAGE LAYOUT
%%%-----------------------------------------------------------------------------
\setlrmarginsandblock{0.15\paperwidth}{*}{1} % Left and right margin
......@@ -207,7 +208,9 @@ openany, % A chapter may start on either a recto or verso page.
columns=fullflexible,
backgroundcolor=\color{black!05},
linewidth=0.9\linewidth,
xleftmargin=0.1\linewidth
xleftmargin=0.1\linewidth,
showspaces=false,
showstringspaces=false
}
\usepackage{tikz}
......@@ -371,7 +374,42 @@ Scientific Computing and Imaging Institute, University of Utah, USA}
\clearpage
\chapter{Introduction}
Welcome to the developer's guide for Nektar++\cite{nektar-website}.
Nektar++ \cite{CaMoCoBoRo15} is a tensor product based finite element package
designed to allow one to construct efficient classical low polynomial order
$h$-type solvers (where $h$ is the size of the finite element) as well as higher
$p$-order piecewise polynomial order solvers. The framework currently has the
following capabilities:
\begin{itemize}
\item Representation of one, two and three-dimensional fields as a collection of
piecewise continuous or discontinuous polynomial domains.
\item Segment, plane and volume domains are permissible, as well as domains
representing curves and surfaces (dimensionally-embedded domains).
\item Hybrid shaped elements, i.e. triangles and quadrilaterals or tetrahedra,
prisms and hexahedra.
\item Both hierarchical and nodal expansion bases.
\item Continuous or discontinuous Galerkin operators.
\item Cross platform support for Linux, Mac OS X and Windows.
\end{itemize}
The framework comes with a number of solvers and also allows one to construct a
variety of new solvers.
Our current goals are to develop:
\begin{itemize}
\item Auto-tuning of optimal operator implementations based upon not
only $h$ and $p$ but also hardware considerations and mesh connectivity.
\item Temporal and spatial adaption.
\item Features enabling evaluation of high-order meshing techniques.
\end{itemize}
This document provides implementation details for the design of the libraries,
Nektar++-specific data structures and algorithms and other development
information.
\begin{warningbox}
This document is still under development and may be incomplete in parts.
\end{warningbox}
\mainmatter
......@@ -392,7 +430,7 @@ Welcome to the developer's guide for Nektar++\cite{nektar-website}.
%%% -------------------------------------------------------------
\bibliographystyle{plain}
\bibliography{developer-guide}
\bibliography{../refs}
\printindex
......
......@@ -268,6 +268,71 @@ in the human arterial system},
publisher={Cambridge Univ Press}
}
@book{KaSh05,
title={Spectral/hp Element Methods for Computational Fluid Dynamics},
author={Karniadakis, G. E. and Sherwin, S. J.},
publisher={Oxford Science Publications},
year={2005}
}
@article{Bu06,
title={General linear methods},
author={Butcher, J. C.},
journal={Acta Numerica},
volume={15},
pages={157-256},
year={2006}
}
@article{VoEsBoChKi11,
title={A generic framework for time-stepping partial differential equations (PDEs):
general linear methods, object-oriented implementation and application to fluid
problems},
author={Vos, P. E. J. and Eskilsson, C. and Bolis, A. and Chun, S. and Kirby, R. M.
and Sherwin, S. J.},
journal={International Journal of Computational Fluid Dynamics},
volume={25},
issue={3},
pages={107-125},
year={2011}
}
@article{CaMoCoBoRo15,
title={Nektar++: An open-source spectral/hp element framework},
author={Cantwell, CD and Moxey, D and Comerford, A and Bolis, A and Rocco, G and Mengaldo, G and De Grazia, D and Yakovlev, S and Lombard, J-E and Ekelschot, D and others},
journal={Computer Physics Communications},
year={2015},
publisher={Elsevier}
}
@book{Ga39,
title={Orthogonal polynomials},
author={Gabor Szeg{\"o}},
volume={23},
year={1939},
publisher={American Mathematical Soc.}
}
@book{AbSt64,
title={Handbook of mathematical functions},
author={Abramowitz, Milton and Stegun, Irene A},
year={1972},
publisher={Dover}
}
@techreport{CaHuYoQu88,
title={Spectral methods in fluid dynamics},
author={Canuto, Claudio and Hussaini, M Yousuff and Quarteroni, Alfio and Zang, Thomas A},
year={1988},
institution={Springer}
}
@article{GhOs70,
title={Quadrature formulae},
author={Ghizzetti, Alessandro and Ossicini, Aldo},
year={1970},
publisher={Birkh{\"a}user}
}
@article{DoKa14,
title={A robust and accurate outflow boundary condition for incompressible flow
simulations on severely-truncated unbounded domains},
......
\chapter{Introduction}
Nektar++ is a tensor product based finite element package designed to allow one
to construct efficient classical low polynomial order $h$-type solvers (where
$h$ is the size of the finite element) as well as higher $p$-order piecewise
polynomial order solvers. The framework currently has the following
capabilities:
Nektar++ \cite{CaMoCoBoRo15} is a tensor product based finite element package
designed to allow one to construct efficient classical low polynomial order
$h$-type solvers (where $h$ is the size of the finite element) as well as higher
$p$-order piecewise polynomial order solvers. The framework currently has the
following capabilities:
\begin{itemize}
\item Representation of one, two and three-dimensional fields as a collection of
......
......@@ -421,7 +421,7 @@ Scientific Computing and Imaging Institute, University of Utah, USA}
%%% -------------------------------------------------------------
\bibliographystyle{plain}
\bibliography{user-guide}
\bibliography{../refs}
\printindex
......
......@@ -203,6 +203,42 @@ which produces a field file \inlsh{threshold\_max.fld}.
Performs the same function as the \inltt{ThresholdMax} filter but records the
time at which the threshold variable drops below a prescribed value.
\subsection{One-dimensional energy}
This filter is designed to output the energy spectrum of one-dimensional
elements. It transforms the solution field at each timestep into an orthogonal
basis defined by the functions
\[
\psi_p(\xi) = L_p(\xi)
\]
where $L_p$ is the $p$-th Legendre polynomial. This can be used to show the
presence of, for example, oscillations in the underlying field due to numerical
instability. The resulting output is written into a file called
\inltt{session.eny} by default. The following parameters are supported:
\begin{center}
\begin{tabularx}{0.99\textwidth}{lllX}
\toprule
\textbf{Option name} & \textbf{Required} & \textbf{Default} &
\textbf{Description} \\
\midrule
\inltt{OutputFile} & \xmark & \inltt{session} &
Prefix of the output filename to which the energy spectrum is written.\\
\inltt{OutputFrequency} & \xmark & 1 &
Number of timesteps after which output is written.\\
\bottomrule
\end{tabularx}
\end{center}
An example syntax is given below:
\begin{lstlisting}[style=XMLStyle,gobble=2]
<FILTER TYPE="Energy1D">
<PARAM NAME="OutputFile">EnergyFile</PARAM>
<PARAM NAME="OutputFrequency">10</PARAM>
</FILTER>
\end{lstlisting}
\subsection{Modal energy}
\begin{notebox}
......
......@@ -22,6 +22,7 @@ SET(SOLVER_UTILS_SOURCES
Filters/FilterAeroForces.cpp
Filters/FilterAverageFields.cpp
Filters/FilterCheckpoint.cpp
Filters/FilterEnergy1D.cpp
Filters/FilterEnergyBase.cpp
Filters/FilterHistoryPoints.cpp
Filters/FilterModalEnergy.cpp
......@@ -63,6 +64,7 @@ SET(SOLVER_UTILS_HEADERS
Filters/FilterAeroForces.h
Filters/FilterAverageFields.h
Filters/FilterCheckpoint.h
Filters/FilterEnergy1D.h
Filters/FilterEnergyBase.h
Filters/FilterHistoryPoints.h
Filters/FilterModalEnergy.h
......
///////////////////////////////////////////////////////////////////////////////
//
// File FilterEnergy1D.cpp
//
// For more information, please see: http://www.nektar.info
//
// The MIT License
//
// Copyright (c) 2006 Division of Applied Mathematics, Brown University (USA),
// Department of Aeronautics, Imperial College London (UK), and Scientific
// Computing and Imaging Institute, University of Utah (USA).
//
// License for the specific language governing rights and limitations under
// Permission is hereby granted, free of charge, to any person obtaining a
// copy of this software and associated documentation files (the "Software"),
// to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense,
// and/or sell copies of the Software, and to permit persons to whom the
// Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
// OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
// THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
// FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
// DEALINGS IN THE SOFTWARE.
//
// Description: Outputs orthogonal expansion of 1D elements.
//
///////////////////////////////////////////////////////////////////////////////
#include <LibUtilities/Foundations/InterpCoeff.h>
#include <SolverUtils/Filters/FilterEnergy1D.h>
namespace Nektar
{
namespace SolverUtils
{
std::string FilterEnergy1D::className = GetFilterFactory().
RegisterCreatorFunction("Energy1D", FilterEnergy1D::create);
/**
* @brief Set up filter with output file and frequency parameters.
*
* @param pSession Current session.
* @param pParams Map of parameters defined in XML file.
*/
FilterEnergy1D::FilterEnergy1D(
const LibUtilities::SessionReaderSharedPtr &pSession,
const std::map<std::string, std::string> &pParams) :
Filter(pSession),
m_index(0)
{
std::string outName;
if (pParams.find("OutputFile") == pParams.end())
{
outName = m_session->GetSessionName();
}
else
{
ASSERTL0(!(pParams.find("OutputFile")->second.empty()),
"Missing parameter 'OutputFile'.");
outName = pParams.find("OutputFile")->second;
}
if (pParams.find("OutputFrequency") == pParams.end())
{
m_outputFrequency = 1;
}
else
{
m_outputFrequency =
atoi(pParams.find("OutputFrequency")->second.c_str());
}
outName += ".eny";
ASSERTL0(pSession->GetComm()->GetSize() == 1,
"The 1D energy filter currently only works in serial.");
m_out.open(outName.c_str());
}
/**
* @brief Destructor.
*/
FilterEnergy1D::~FilterEnergy1D()
{
}
/**
* @brief Initialize filter.
*/
void FilterEnergy1D::v_Initialise(
const Array<OneD, const MultiRegions::ExpListSharedPtr> &pFields,
const NekDouble &time)
{
ASSERTL0(pFields[0]->GetExp(0)->GetNumBases() == 1,
"The Energy 1D filter is only valid in 1D.");
}
/**
* @brief Update filter output with the current timestep's orthogonal
* coefficients.
*/
void FilterEnergy1D::v_Update(
const Array<OneD, const MultiRegions::ExpListSharedPtr> &pFields,
const NekDouble &time)
{
// Only output every m_outputFrequency
if ((m_index++) % m_outputFrequency)
{
return;
}
int nElmt = pFields[0]->GetExpSize();
// Loop over all elements
m_out << "##" << endl;
m_out << "## Time = " << time << endl;
m_out << "##" << endl;
for (int i = 0; i < nElmt; ++i)
{
// Figure out number of modes in this expansion.
LocalRegions::ExpansionSharedPtr exp = pFields[0]->GetExp(i);
int nModes = exp->GetBasis(0)->GetNumModes();
// Set up basis key for orthogonal basis
LibUtilities::BasisType btype = LibUtilities::eOrtho_A;
LibUtilities::BasisKey bkeyOrth(
btype, nModes, exp->GetBasis(0)->GetPointsKey());
// Get basis key for existing expansion
LibUtilities::BasisKey bkey(
exp->GetBasis(0)->GetBasisType(),
exp->GetBasis(0)->GetNumModes(),
exp->GetBasis(0)->GetPointsKey());
// Find coeffs for this element in the list of all coefficients
Array<OneD, NekDouble> coeffs =
pFields[0]->GetCoeffs() + pFields[0]->GetCoeff_Offset(i);
// Storage for orthogonal coefficients
Array<OneD, NekDouble> coeffsOrth(nModes);
// Project from coeffs -> orthogonal coeffs
LibUtilities::InterpCoeff1D(bkey, coeffs, bkeyOrth, coeffsOrth);
// Write coeffs to file
m_out << "# Element " << i << " (ID "
<< exp->GetGeom()->GetGlobalID() << ")" << endl;
for (int j = 0; j < nModes; ++j)
{
m_out << coeffsOrth[j] << endl;
}
}
m_out << endl;
}
void FilterEnergy1D::v_Finalise(
const Array<OneD, const MultiRegions::ExpListSharedPtr> &pFields,
const NekDouble &time)
{
m_out.close();
}
bool FilterEnergy1D::v_IsTimeDependent()
{
return true;
}
}
}
///////////////////////////////////////////////////////////////////////////////
//
// File FilterEnergy1D.h
//
// For more information, please see: http://www.nektar.info
//
// The MIT License
//
// Copyright (c) 2006 Division of Applied Mathematics, Brown University (USA),
// Department of Aeronautics, Imperial College London (UK), and Scientific
// Computing and Imaging Institute, University of Utah (USA).
//
// License for the specific language governing rights and limitations under
// Permission is hereby granted, free of charge, to any person obtaining a
// copy of this software and associated documentation files (the "Software"),
// to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense,
// and/or sell copies of the Software, and to permit persons to whom the
// Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
// OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
// THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER