Draft: Execution model

Closed James Edgeley requested to merge execution-model into master
3 unresolved threads

Operator Checklist

Operator Std(Mat) SumFac MatFree Cuda
BwdTrans
PhysDeriv
IProductWRTBase
IProductWRTDeriv
Helmholtz
Identity
Mass
AssmbScatr
ConjGrad
FwdTrans
HelmSolve
NullPrecon
DiagPrecon
RobBndCond
NeuBndCond
DirBndCond
Matrix

Adds execution model

ExecutionModel

Operators are templated on an execution model, which determines the backend of the dependency graph that executes them.

The Cuda model wraps the CUDA Graphs functionality (cudaGraph; requires CUDA 10+).

The Thread model wraps a Chase-Lev work-stealing queue using std::jthread (requires C++20).
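As a rough illustration of the access pattern behind a work-stealing queue, the sketch below lets the owning worker push and pop at one end while other workers steal from the opposite end. It is not the implementation in this branch: `StealableDeque` is a hypothetical name, a mutex stands in for the lock-free atomics of the real Chase-Lev algorithm, and the worker threads themselves are left out.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical, simplified stand-in for a Chase-Lev deque. It has the same
// access pattern (owner works at the bottom, thieves take from the top) but
// is guarded by a mutex rather than being lock-free.
class StealableDeque
{
public:
    using Task = std::function<int()>;

    // Owner: push new work at the bottom.
    void push(Task task)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_tasks.push_back(std::move(task));
    }

    // Owner: pop the most recently pushed task (LIFO).
    std::optional<Task> pop()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_tasks.empty())
        {
            return std::nullopt;
        }
        Task task = std::move(m_tasks.back());
        m_tasks.pop_back();
        return task;
    }

    // Thief: steal the oldest task from the top (FIFO).
    std::optional<Task> steal()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_tasks.empty())
        {
            return std::nullopt;
        }
        Task task = std::move(m_tasks.front());
        m_tasks.pop_front();
        return task;
    }

private:
    std::mutex m_mutex;
    std::deque<Task> m_tasks;
};
```

In a full scheduler each worker thread would own one such deque and call `steal()` on another worker's deque only when its own is empty.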

The Serial model wraps a std::queue.

The Cuda execution model has a host sub-model, which can be either Thread or Serial, allowing operations to occur on the device and the host simultaneously.
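The relationship between an execution model, its host sub-model and an operator templated on it could look roughly like the sketch below. Everything except the `HostExecution` alias name (which appears in the example operator further down) is an assumption made for illustration: `Serial` here is a toy model, `DeviceWithHost` and `Scale` are hypothetical, and a plain `typename` parameter is used in place of the `ExecutionModel` constraint so that the sketch stands alone.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy Serial model: tasks sit in a std::queue and run in submission order
// when the model is launched.
struct Serial
{
    using HostExecution = Serial; // a host model is its own host sub-model

    void submit(std::function<void()> task)
    {
        m_tasks.push(std::move(task));
    }

    void launch()
    {
        while (!m_tasks.empty())
        {
            std::function<void()> task = std::move(m_tasks.front());
            m_tasks.pop();
            task();
        }
    }

private:
    std::queue<std::function<void()>> m_tasks;
};

// Hypothetical device model that carries a host sub-model, so host-side
// work can be queued alongside device-side work (device part omitted).
template <typename Host>
struct DeviceWithHost
{
    using HostExecution = Host;
    Host host;
};

// Hypothetical operator templated on the execution model: applying it only
// submits work; nothing runs until the model is launched.
template <typename EM>
class Scale
{
public:
    explicit Scale(EM &em) : m_em(em)
    {
    }

    void apply(std::vector<double> &data, double factor)
    {
        m_em.submit([&data, factor] {
            for (double &x : data)
            {
                x *= factor;
            }
        });
    }

private:
    EM &m_em;
};
```

Swapping the backend then means changing only the template argument, for example `Scale<Serial>` versus an operator instantiated on a threaded model.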

Operators consist of a team of implementations, each of which acts on a block (chunk) of elements. Each implementation has a node in the relevant graph, and a node is executed as soon as its parents are complete. An operator object can use different implementations on different blocks, or the same implementation on all of them. Applying an operator applies all of its constituent implementations, and applying an implementation submits its kernel to the graph. Launching the graph executes the nodes according to the dependency structure.
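The node and parent relationship described above can be sketched with a small task graph in which a node becomes ready once all of its parents have run. This is only an illustration under assumed names (`TaskGraph`, `add`, `launch` and `demo` are hypothetical, not the API of this branch), and it executes serially from a queue of ready nodes.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical minimal task graph: a node is run once every parent has run.
class TaskGraph
{
public:
    using NodeId = std::size_t;

    // Add a node whose kernel must run after every node in `parents`.
    // Parents must already have been added to the graph.
    NodeId add(std::function<void()> kernel,
               const std::vector<NodeId> &parents = {})
    {
        NodeId id = m_nodes.size();
        m_nodes.push_back({std::move(kernel), {}, parents.size()});
        for (NodeId p : parents)
        {
            m_nodes[p].children.push_back(id);
        }
        return id;
    }

    // Execute all nodes in dependency order using a queue of ready nodes.
    void launch()
    {
        std::vector<std::size_t> pending;
        std::queue<NodeId> ready;
        for (NodeId i = 0; i < m_nodes.size(); ++i)
        {
            pending.push_back(m_nodes[i].nParents);
            if (pending[i] == 0)
            {
                ready.push(i);
            }
        }
        while (!ready.empty())
        {
            NodeId n = ready.front();
            ready.pop();
            m_nodes[n].kernel();
            for (NodeId c : m_nodes[n].children)
            {
                if (--pending[c] == 0)
                {
                    ready.push(c);
                }
            }
        }
    }

private:
    struct Node
    {
        std::function<void()> kernel;
        std::vector<NodeId> children;
        std::size_t nParents;
    };
    std::vector<Node> m_nodes;
};

// Two per-block kernels of one operator, followed by a node that depends on
// both of them. Returns the order in which the kernels ran.
inline std::vector<int> demo()
{
    std::vector<int> order;
    TaskGraph g;
    TaskGraph::NodeId b0 = g.add([&order] { order.push_back(0); });
    TaskGraph::NodeId b1 = g.add([&order] { order.push_back(1); });
    g.add([&order] { order.push_back(2); }, {b0, b1});
    g.launch();
    return order;
}
```

The two block nodes have no parents, so a parallel backend would be free to run them concurrently; only the final node has to wait.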

Operators and implementations are registered by inheriting from a CRTP registrar class, which registers their create methods in a factory. This gives the implementations access to the scope of the relevant operator (e.g. its argument types).

template <ExecutionModel EM, typename TData>
class ExampleOperator final
    : public Operator<EM, TData>::template Registrar<ExampleOperator<EM, TData>>
{
public:
    // Traits of the operator
    static constexpr OperatorType type     = OperatorType::ExampleOperator;
    static constexpr std::string_view name = "ExampleOperator";
    static constexpr std::string_view desc =
        "Example of an Operator";

    // Argument types
    typedef Cochain<TData, FieldState::Coeff> ArgIn;
    typedef Cochain<TData, FieldState::Phys> ArgOut;

    // Base for implementations
    using BlockOp = BlockOperator<EM, TData, ArgIn, ArgOut>;

    ExampleOperator(GraphDetail<EM> &graph, Chain<TData> &chain)
        : Operator<EM, TData>::template Registrar<ExampleOperator<EM, TData>>(
              graph, chain)
    {
    }

    // User functions
    static std::unique_ptr<ExampleOperator<EM, TData>> Create(
        GraphDetail<EM> &G, Chain<TData> &C)
    {
        return Operator<EM, TData>::template Create<ExampleOperator<EM, TData>>(
            G, C);
    }

    void Apply(ArgIn &I, ArgOut &O)
    {
        return Operator<EM, TData>::template Apply<ExampleOperator<EM, TData>>(
            I, O);
    }

    // Implementation list
    enum class ImplType
    {
        StdMat
    };

    static constexpr ImplType default_impl = ImplType::StdMat;
    using Memory_t                         = MemoryRegion<TData>;

    // Implementation definitions
    class StdMat final
        : public ExampleOperator<EM, TData>::template RegistrarImpl<StdMat>
    {
        using Base_t =
            typename ExampleOperator<EM, TData>::template RegistrarImpl<StdMat>;
        using EMH = typename EM::HostExecution;

    public:
        static constexpr ImplType key          = ImplType::StdMat;
        static constexpr std::string_view name = "StdMat";
        static constexpr std::string_view desc =
            "Uses the standard matrix construction";

        StdMat(GraphDetail<EM> &graph, ChainBlock<TData> &block,
               HostDevice<EMH> &device);


        static std::unique_ptr<BlockOp> create(GraphDetail<EM> &graph,
                                               ChainBlock<TData> &block,
                                               Device &device);
        void apply(ArgIn &in, ArgOut &out) override final;

    private:
        HostDevice<EMH> &m_device;
        HostMemory &m_resource = DefaultMemory;
        std::unique_ptr<Memory_t> m_basis;
    };

};
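For readers unfamiliar with the pattern, a self-registering CRTP registrar can be sketched as below. This is a generic illustration of the technique and not the `Registrar` used in this branch: `OperatorBase`, `factory`, `Registrar` and `ToyOperator` are hypothetical names, and the factory is keyed on a plain string.

```cpp
#include <map>
#include <memory>
#include <string>
#include <string_view>

struct OperatorBase
{
    virtual ~OperatorBase() = default;
    virtual std::string describe() const = 0;
};

using Creator = std::unique_ptr<OperatorBase> (*)();

// Function-local static, so the map exists before any registration runs.
inline std::map<std::string, Creator> &factory()
{
    static std::map<std::string, Creator> f;
    return f;
}

// CRTP registrar: deriving from Registrar<Derived> stores Derived::create
// in the factory under Derived::name during static initialisation.
template <typename Derived>
class Registrar : public OperatorBase
{
protected:
    Registrar()
    {
        // Taking the address odr-uses the static member, which forces its
        // initialiser (and hence the registration) to be instantiated.
        (void)&s_registered;
    }

private:
    static bool registerType()
    {
        factory()[std::string(Derived::name)] = &Derived::create;
        return true;
    }

    static inline const bool s_registered = registerType();
};

class ToyOperator final : public Registrar<ToyOperator>
{
public:
    static constexpr std::string_view name = "ToyOperator";

    ToyOperator()
    {
    }

    static std::unique_ptr<OperatorBase> create()
    {
        return std::make_unique<ToyOperator>();
    }

    std::string describe() const override
    {
        return "toy operator (sketch)";
    }
};
```

With this in place, `factory().at("ToyOperator")()` would return a new `ToyOperator` without any explicit registration call at the point of use.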
Edited by James Edgeley

Activity

void SendRecv(const void *sendbuf, int sendcount, CommDataType sendtype,
              int dest, void *recvbuf, int recvcount, CommDataType recvtype,
              int source);
void AllReduce(void *buf, int count, CommDataType dt,
               enum ReduceOperator pOp);
void AlltoAll(const void *sendbuf, int sendcount, CommDataType sendtype,
              void *recvbuf, int recvcount, CommDataType recvtype);
void AlltoAllv(const void *sendbuf, int sendcounts[], int sdispls[],
               CommDataType sendtype, void *recvbuf, int recvcounts[],
               int rdispls[], CommDataType recvtype);
void AllGather(const void *sendbuf, int sendcount, CommDataType sendtype,
               void *recvbuf, int recvcount, CommDataType recvtype);
void AllGatherv(const void *sendbuf, int sendcount, CommDataType sendtype,
                void *recvbuf, int recvcounts[], int rdispls[],
                CommDataType recvtype);
void AllGatherv(void *recvbuf, int recvcounts[], int rdispls[],
  • BOYANG XIA added 1 commit

    • 055a5e2b - fix something in executionMPI

  • Author Developer

    It's probably possible to use it, but I don't think it's necessary, as we don't require any polymorphism and we know which communicator to use at compile time. Also, for the non-blocking and wait calls it calls GetRequest every time, which might have a small performance overhead.

  • BOYANG XIA added 1 commit

    • 27662e48 - revert last commit: avoid include comm.h

  • BOYANG XIA added 1 commit

  • Has anyone successfully built this branch (C++20) on an Ubuntu PC? By default, Ubuntu comes with only g++-11. I tried installing g++-13 manually, but I am still unable to build it.

    The Kingscross machine only has g++-12, which does not fully support C++20 features, does it?

  • James Edgeley added 1 commit

  • James Edgeley added 1 commit

  • BOYANG XIA added 1 commit

    • 3ed31866 - Fix g++ error: ImplModule is parsed as a non-type, but instantiation yields a type

  • James Edgeley added 2 commits

  • James Edgeley added 1 commit

  • James Edgeley added 1 commit

  • James Edgeley added 1 commit

    • @jedgeley I don't know whether you have built it successfully on Linux, but I struggled a lot to compile it. Last week I tried and got errors like:

      log.txt

      To be honest, that is only one of many errors!

    • Author Developer

      I added some CMake flags last week which I think got it working on Linux. I have since switched to working on a branch in the main Nektar repository, though, so maybe try that.
