FieldConvert tests failing in Debug on Windows
With a Windows 10 MPI Debug build, three FieldConvert tests are failing:
FieldConvert_Tet_channel_npart_wss
FieldConvert_chan3D_tec_par
FieldConvert_chan3D_vort_par
This doesn't seem to cause an issue on Windows 10 with a Release build or on Linux/Mac OS; it was only encountered when running the Debug build via the CI system.
It seems that these tests do not all fail for the same reason. For the first one, FieldConvert_Tet_channel_npart_wss, the error is an assertion raised in the standard library's vector implementation, caused by trying to access an element of an empty vector. The stack trace is as follows:
LibUtilities-g.dll!std::vector<unsigned int,std::allocator<unsigned int>>::operator[](const unsigned __int64 _Pos) Line 1502 C++
LibUtilities-g.dll!Nektar::LibUtilities::CommDataTypeTraits<std::vector<unsigned int,std::allocator<unsigned int>>>::GetPointer(std::vector<unsigned int,std::allocator<unsigned int>> & val) Line 130 C++
LibUtilities-g.dll!Nektar::LibUtilities::Comm::Recv<std::vector<unsigned int,std::allocator<unsigned int>>>(int pProc, std::vector<unsigned int,std::allocator<unsigned int>> & pData) Line 261 C++
LibUtilities-g.dll!Nektar::LibUtilities::FieldIOXml::SetUpFieldMetaData(const std::string & outname, const std::vector<std::shared_ptr<Nektar::LibUtilities::FieldDefinitions>,std::allocator<std::shared_ptr<Nektar::LibUtilities::FieldDefinitions>>> & fielddefs, const std::map<std::string,std::string,std::less<std::string>,std::allocator<std::pair<std::string const ,std::string>>> & fieldmetadatamap) Line 735 C++
LibUtilities-g.dll!Nektar::LibUtilities::FieldIOXml::v_Write(const std::string & outFile, std::vector<std::shared_ptr<Nektar::LibUtilities::FieldDefinitions>,std::allocator<std::shared_ptr<Nektar::LibUtilities::FieldDefinitions>>> & fielddefs, std::vector<std::vector<double,std::allocator<double>>,std::allocator<std::vector<double,std::allocator<double>>>> & fielddata, const std::map<std::string,std::string,std::less<std::string>,std::allocator<std::pair<std::string const ,std::string>>> & fieldmetadatamap, const bool backup) Line 120 C++
LibUtilities-g.dll!Nektar::LibUtilities::FieldIO::Write(const std::string & outFile, std::vector<std::shared_ptr<Nektar::LibUtilities::FieldDefinitions>,std::allocator<std::shared_ptr<Nektar::LibUtilities::FieldDefinitions>>> & fielddefs, std::vector<std::vector<double,std::allocator<double>>,std::allocator<std::vector<double,std::allocator<double>>>> & fielddata, const std::map<std::string,std::string,std::less<std::string>,std::allocator<std::pair<std::string const ,std::string>>> & fieldinfomap, const bool backup) Line 324 C++
FieldUtils-g.dll!Nektar::FieldUtils::OutputFld::OutputFromExp(boost::program_options::variables_map & vm) Line 115 C++
FieldUtils-g.dll!Nektar::FieldUtils::OutputFileBase::Process(boost::program_options::variables_map & vm) Line 207 C++
FieldConvert-g.exe!RunModule(std::shared_ptr<Nektar::FieldUtils::Module> module, boost::program_options::variables_map & vm, bool verbose) Line 746 C++
FieldConvert-g.exe!main(int argc, char * * argv) Line 500 C++
With the assistance of @ccantwel, this was tracked down to an attempt to receive data into the array tmp, where tmp has been created based on an elmtnums[i] value that is 0. This seems to be the result of using a "mock parallel" version of CommSerial to handle multiple partitions. A workaround to resolve this has been added in c989e570.
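To illustrate the failure mode, here is a minimal sketch. It assumes, as the stack trace suggests, that GetPointer obtains its pointer by indexing the first element of the vector. SafePointer and MakeRecvBuffer are hypothetical names used only for illustration; they are not Nektar++ functions and this is not necessarily what the workaround in c989e570 does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not Nektar++ code: returns a pointer to the vector's
// storage without indexing into it. Unlike &val[0], calling data() on an
// empty vector is valid; the result may be null and must not be dereferenced.
template <typename T>
T *SafePointer(std::vector<T> &val)
{
    return val.data();
}

// Hypothetical stand-in for sizing a receive buffer from an element count,
// analogous to creating tmp from an elmtnums[i] value.
inline std::vector<unsigned int> MakeRecvBuffer(std::size_t nElmts)
{
    return std::vector<unsigned int>(nElmts);
}
```

With an element count of 0, MakeRecvBuffer returns an empty vector. Evaluating &tmp[0] on it would trip the checked operator[] in an MSVC Debug build, whereas SafePointer(tmp) does not index the vector at all.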
For FieldConvert_chan3D_tec_par and FieldConvert_chan3D_vort_par, the error is again a failed Debug assertion, vector subscript out of range, in the core vector code. This appears to occur when the data is initially read in as part of the InputXml phase. Within the PartitionMesh function of MeshGraphXml.cpp, at L262, the keys and vals vectors seem to have a size of 0, so taking a pointer to the first element of an empty vector triggers the error. This error affects the main MPI process. The other processes are affected by a similar issue around L296, where keys and vals are again empty vectors as a result of bndSize being 0.
This has been resolved in both cases by wrapping the comm->Bcast(<vector>, 0) calls in an if statement that only runs the Bcast if the vector is not empty. This fixes both of the failing chan3D tests. This fix is in commit e9bea0db.
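The guard described above can be sketched roughly as follows. MockComm and BcastIfNotEmpty are hypothetical names invented for this example; MockComm only counts calls and stands in for the real communicator, so this is an illustration of the pattern rather than the code in e9bea0db.

```cpp
#include <vector>

// Hypothetical stand-in for a communicator; it only counts Bcast calls.
struct MockComm
{
    int bcastCalls = 0;

    template <typename T>
    void Bcast(std::vector<T> &data, int root)
    {
        (void)data;
        (void)root;
        ++bcastCalls;
    }
};

// Only broadcast when there is something to send or receive, so that no
// pointer to the first element of an empty vector is ever requested.
template <typename T>
bool BcastIfNotEmpty(MockComm &comm, std::vector<T> &data, int root)
{
    if (data.empty())
    {
        return false; // nothing to broadcast
    }
    comm.Bcast(data, root);
    return true;
}
```

Because a broadcast is a collective operation, a guard like this is only safe when every process reaches the same decision. That holds when the vectors on all processes are sized from the same previously shared value, as with bndSize here.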
An ASSERTL1 assertion has also been added to the GetPointer functions that accept a vector parameter in CommDataType.h to flag such issues in future.
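As a rough illustration of this kind of check, the sketch below rejects an empty vector before taking the address of its first element. CheckedPointer is a hypothetical name, and a thrown std::logic_error stands in for the ASSERTL1 macro so that the example is self-contained; the actual change in CommDataType.h may differ.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical analogue of a GetPointer overload taking a vector. The
// explicit check reports the problem with a clear message on every platform,
// instead of relying on the standard library's debug-only subscript check.
template <typename T>
T *CheckedPointer(std::vector<T> &val)
{
    if (val.empty())
    {
        throw std::logic_error("Vector cannot be empty");
    }
    return &val[0];
}
```

A check of this form surfaces the empty-vector case on Linux and Mac OS as well, where indexing an empty vector would otherwise typically go unnoticed in these tests.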