Tidy MemoryRegion and introduce ReadOnly, WriteOnly, ReadWrite memory access qualifiers
Issue/feature addressed
- Added ReadOnly, WriteOnly, ReadWrite memory access qualifiers to the GetPtr function.
- The copyRegion, copyArray, copyVector functions are now helper functions wrapped around the unified copyFrom function. Added a HostToHost, HostToDevice, DeviceToHost, DeviceToDevice template parameter option to the copyFrom function to allow the user to fully control memory copies to a MemoryRegion. The default is HostToDevice to maintain backward compatibility.
- Added the DeviceOnly, HostOnly, HostDevice template parameter to the initialize function to allow the user to fully control MemoryRegion initialization.
- Unified all MemoryRegionHost/MemoryRegionDevice constructors into a single one, taking one data pointer and a size parameter.
- Unified all creation methods in MemoryRegion into a fromData creator, taking one data pointer and a size parameter. The fromVector and fromArray functions are now simply "helper" functions that call fromData. Note: creating a MemoryRegion from an Array<OneD, Array<OneD, TDataIn>> is not efficient, but the use of Array<OneD, Array<OneD, TDataIn>> should eventually be deprecated.
TODO (Future MR?):
- Add unit test for MemoryRegion
Proposed solution
- If GetPtr is called with the ReadOnly memory access qualifier, a const pointer is returned. A host-to-device or device-to-host copy occurs if necessary.
- If GetPtr is called with the WriteOnly memory access qualifier, a non-const pointer is returned. No host-to-device or device-to-host copy occurs.
- If GetPtr is called with the ReadWrite memory access qualifier, a non-const pointer is returned. A host-to-device or device-to-host copy occurs if necessary.
Implementation
Note: Most of the proposed implementation relies on pre-existing infrastructure. It combines the previous GetPtr and GetConstPtr metafunctions and adds a new WriteOnly memory access qualifier.
- Introduce the ReadOnly, WriteOnly, ReadWrite memory access qualifiers within the MemoryQualifier namespace:
class ReadOnly
{
};
class WriteOnly
{
};
class ReadWrite
{
};
- Introduce the const_if metafunction. This metafunction allows the definition of a single GetPtr metafunction returning either a const TData * or a TData * depending on the template type parameter. The implementation is modelled on a possible implementation of the standard std::enable_if metafunction:
template <bool B, class T = void> struct const_if
{
    typedef T type;
};
template <class T> struct const_if<true, T>
{
    typedef const T type;
};
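To see how const_if resolves, the following minimal sketch restates the two templates above and checks the resulting pointer types at compile time. It is illustrative only and is not part of the MR; the static_assert lines are additions for this example.

```cpp
#include <type_traits>

// Same shape as the const_if metafunction described above.
template <bool B, class T = void> struct const_if
{
    typedef T type;
};
template <class T> struct const_if<true, T>
{
    typedef const T type;
};

// B == true adds const to the pointee, B == false leaves it untouched.
static_assert(
    std::is_same<const_if<true, double>::type *, const double *>::value,
    "true selects a pointer to const");
static_assert(
    std::is_same<const_if<false, double>::type *, double *>::value,
    "false selects a pointer to non-const");
```

The same selection could also be written with the standard std::conditional (std::conditional<B, const T, T>::type); the dedicated const_if keeps the GetPtr signature shorter.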
- All const and non-const host/device pointers are now accessed using a single templated GetPtr metafunction:
template <typename MemSpace,
          typename AccessQualifier = MemoryQualifier::ReadWrite>
typename const_if<
    std::is_same<AccessQualifier, MemoryQualifier::ReadOnly>::value,
    TData>::type *
GetPtr()
...
    if constexpr (std::is_same<AccessQualifier,
                               MemoryQualifier::ReadOnly>::value)
    {
        return m_storage->GetHostConstPtr();
    }
    else if constexpr (std::is_same<AccessQualifier,
                                    MemoryQualifier::WriteOnly>::value)
    {
        return m_storage->GetHostPtr(true);
    }
    else if constexpr (std::is_same<AccessQualifier,
                                    MemoryQualifier::ReadWrite>::value)
    {
        return m_storage->GetHostPtr();
    }
...
    if constexpr (std::is_same<AccessQualifier,
                               MemoryQualifier::ReadOnly>::value)
    {
        return ret.GetDeviceConstPtr();
    }
    else if constexpr (std::is_same<AccessQualifier,
                                    MemoryQualifier::WriteOnly>::value)
    {
        return ret.GetDevicePtr(true);
    }
    else if constexpr (std::is_same<AccessQualifier,
                                    MemoryQualifier::ReadWrite>::value)
    {
        return ret.GetDevicePtr();
    }
...
Note: ReadWrite is the default access qualifier (this helps avoid breaking too much legacy code, notably the unit tests).
- Within the GetHostPtr and GetDevicePtr functions (in MemoryRegionHost/MemoryRegionDevice), skip the unnecessary device-to-host or host-to-device copy when the write_only flag is enabled, and set the m_host_valid/m_device_valid flag to true.
TData *GetHostPtr(bool write_only) override
{
    if (write_only)
    {
        this->m_host_valid = true;
    }
    else
    {
        DeviceToHost(); // Move to host if necessary
    }
    m_device_valid = false;
    return this->m_host;
}
TData *GetDevicePtr(bool write_only = false)
{
    if (write_only)
    {
        m_device_valid = true;
    }
    else
    {
        HostToDevice(); // Move to device if necessary
    }
    this->m_host_valid = false;
    return m_device;
}
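The effect of the write_only flag on the validity flags can be traced with a small CPU-only sketch. ToyStorage, numCopies and demoCopyCount are hypothetical names invented for this illustration, and the "device" buffer is an ordinary std::vector; the real classes in this MR differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a host/device buffer pair. Both "spaces" are
// plain std::vector objects so the sketch runs on a CPU-only machine.
template <typename TData> class ToyStorage
{
public:
    explicit ToyStorage(std::size_t n)
        : m_host(n), m_device(n), m_host_valid(true), m_device_valid(false)
    {
    }

    TData *GetHostPtr(bool write_only = false)
    {
        if (write_only)
        {
            m_host_valid = true; // caller overwrites the data: skip the copy
        }
        else
        {
            DeviceToHost(); // copy only if the host side is stale
        }
        m_device_valid = false;
        return m_host.data();
    }

    TData *GetDevicePtr(bool write_only = false)
    {
        if (write_only)
        {
            m_device_valid = true; // caller overwrites the data: skip the copy
        }
        else
        {
            HostToDevice(); // copy only if the device side is stale
        }
        m_host_valid = false;
        return m_device.data();
    }

    // Number of simulated transfers performed so far.
    int numCopies() const
    {
        return m_copies;
    }

private:
    void HostToDevice()
    {
        if (!m_device_valid)
        {
            m_device = m_host;
            ++m_copies;
            m_device_valid = true;
        }
    }

    void DeviceToHost()
    {
        if (!m_host_valid)
        {
            m_host = m_device;
            ++m_copies;
            m_host_valid = true;
        }
    }

    std::vector<TData> m_host, m_device;
    bool m_host_valid, m_device_valid;
    int m_copies = 0;
};

// Write-only device access followed by a read-write host access:
// only the second call triggers a transfer.
inline int demoCopyCount()
{
    ToyStorage<double> s(4);
    double *d = s.GetDevicePtr(true); // no host-to-device copy
    d[0] = 1.0;
    double *h = s.GetHostPtr(); // host is stale: one device-to-host copy
    assert(h[0] == 1.0);
    return s.numCopies();
}
```

In this toy model, replacing GetDevicePtr(true) with GetDevicePtr() would add one extra host-to-device transfer whose contents are immediately overwritten, which is the cost the write_only path avoids.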
- To give the user proper control over memory initialization, a MemoryWrite template parameter with DeviceOnly, HostOnly, HostDevice options has been added to the initialize function. The MemoryWrite template parameter is ignored for pure host (CPU) code.
Note: The default value is set to DeviceOnly to preserve backward compatibility.
template <typename MemWrite = MemoryWrite::DeviceOnly>
void initialize(TData val, size_t count = 0, size_t offset = 0)
{
    ...
}
- Added a MemoryCopy template parameter with options HostToHost, HostToDevice, DeviceToHost, DeviceToDevice to the copyFrom function to allow the user to fully control memory copies to a MemoryRegion.
Note: The default is set to HostToDevice to preserve backward compatibility.
template <typename MemSpace, typename TDataIn = TData,
          typename MemCopy = MemoryCopy::HostToDevice>
void copyFrom(const TDataIn *src, const size_t size,
              const size_t offset = 0)
{
    ...
}
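How a defaulted tag parameter keeps existing call sites unchanged can be sketched as follows. This is a simplified illustration under stated assumptions: ToyRegion is a hypothetical class, the MemSpace and TDataIn parameters are dropped, and both buffers live in host memory, so the tag only selects the destination buffer. The real copyFrom in this MR is more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Tag types mirroring the option names listed above.
namespace MemoryCopy
{
struct HostToHost
{
};
struct HostToDevice
{
};
struct DeviceToHost
{
};
struct DeviceToDevice
{
};
} // namespace MemoryCopy

// Hypothetical toy region: the "device" buffer is an ordinary vector.
template <typename TData> struct ToyRegion
{
    explicit ToyRegion(std::size_t n) : host(n), device(n)
    {
    }

    // The default tag means callers that pass no template argument keep
    // the HostToDevice behaviour.
    template <typename MemCopy = MemoryCopy::HostToDevice>
    void copyFrom(const TData *src, std::size_t size, std::size_t offset = 0)
    {
        // Tags ending in ToHost write to the host buffer, the others to
        // the device buffer.
        constexpr bool toHost =
            std::is_same<MemCopy, MemoryCopy::HostToHost>::value ||
            std::is_same<MemCopy, MemoryCopy::DeviceToHost>::value;
        std::vector<TData> &dst = toHost ? host : device;
        assert(offset + size <= dst.size());
        std::copy(src, src + size, dst.begin() + offset);
    }

    std::vector<TData> host, device;
};
```

A call such as r.copyFrom(src, n) therefore fills the device buffer, while r.copyFrom<MemoryCopy::HostToHost>(src, n) fills the host buffer.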
Example:
One application of the WriteOnly memory access qualifier is the BwdTrans operator. For BwdTrans, the out field does not need to be copied from host to device. This can now be achieved as follows:
void apply(Field<TData, FieldState::Coeff> &in,
           Field<TData, FieldState::Phys> &out) override
{
    // Initialize pointers.
    auto *inPtr = in.template GetPtr<MemSpace, MemoryQualifier::ReadOnly>();
    auto *outPtr =
        out.template GetPtr<MemSpace, MemoryQualifier::WriteOnly>();
    ...
Tests
Suggested reviewers
Notes
Checklist
- Functions and classes, or changes to them, are documented.
- [ ] User guide/documentation is updated.
- [ ] Changelog is updated.
- [ ] Suitable tests added for new functionality.
- Contributed code is correctly formatted. (See the contributing guidelines).
- [ ] License added to any new files.
- No extraneous files have been added (e.g. compiler output or test data files).