- previously automatically skipped the first communicator (which was
assumed to be MPI_COMM_WORLD), but now simply rely on the
internal pendingMPIFree_ to track which communicators have actually
been allocated.
- eliminate ClassName in favour of simple debug
- include Apple-specific FPE handling after local definition
to allow for more redefinitions
COMP: remove stray <csignal> includes
- freeCommunicatorComponents needs an additional bounds check.
When MPI is initialized outside of OpenFOAM, there are no
UPstream communicator equivalents
- MPI_THREAD_MULTIPLE is usually undesirable for performance reasons,
but in some cases may be necessary if a linked library expects it.
Provide a '-mpi-threads' option to explicitly request it.
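  For example (solver name illustrative):

      mpirun -np 4 simpleFoam -parallel -mpi-threads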
ENH: consolidate some looping logic within argList
- with C++11, static constexpr variables that are odr-used also require
a definition in a translation unit, not just the inline initializer.
Mostly not an issue, however gcc with -O0 does not do the inlining
and thus actually requires them to be defined in a translation unit
as well.
These variables were provided for symmetry with worldComm, but only
used in low-level internal code. Changing to inlined functions
solves the linkage issue and also aligns with the commWorld()
function naming.
Mnemonics:
    MPI_COMM_SELF           => UPstream::commSelf()
    overall MPI_COMM_WORLD  => UPstream::commGlobal(), sometimes commWorld()
    local COMM_WORLD        => UPstream::commWorld()
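  A rough sketch of the pattern (values and exact signatures illustrative,
  not verbatim):

      // Before: static constexpr data members; when odr-used, C++11
      // also wants an out-of-line definition in a translation unit:
      //     static constexpr label commGlobal = 0;
      //
      // After: constexpr inline functions, no separate definition
      // needed, and naming aligns with commWorld():
      static constexpr label commGlobal() noexcept { return 0; }
      static constexpr label commSelf() noexcept { return 1; }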
UPstream::allocateCommunicator
- with contiguous sub-procs. Simpler, more compact handling, ranks
are guaranteed to be monotonic
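  A hypothetical usage sketch (assuming a labelRange overload as
  described):

      // Sub-communicator from a contiguous, monotonic set of ranks
      const label subComm = UPstream::allocateCommunicator
      (
          UPstream::worldComm,
          labelRange(0, UPstream::nProcs()/2)   // eg, the first half
      );
      ...
      UPstream::freeCommunicator(subComm);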
UPstream::commWorld(label)
- ignore placeholder values, prevents accidental negative values
- make the communicator non-optional for UPstream::broadcast(), which
means it has three mandatory parameters and is thus always fully
disambiguated from Pstream::broadcast().
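  For example (buffer accessors illustrative):

      // (data pointer, byte count, communicator): all three required
      UPstream::broadcast(buf.data(), buf.size(), UPstream::worldComm);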
ENH: relax size checking on gatherList/scatterList
- only fatal if the List size is less than nProcs.
Can silently ignore any trailing elements: they will be untouched.
- UPstream exit with a non-zero return code is raised by things like
exit(FatalError) which means there is no reason to believe that
any/all of the buffered sends, requests etc have completed.
Thus avoid detaching buffers, freeing communicators etc in this
situation. This makes exit(1) behave much more like abort(), but
without any stack trace. Should presumably help with avoiding
deadlocks on exit.
ENH: support transfer from a wrapped MPI request to global list
- allows coding with a list of UPstream::Request and subsequently
either retaining that list or transferring it into the global list.
- simplifies communication structuring with intra-host communication.
Can be used for IO only, or for specialised communication.
Demand-driven construction. Gathers the SHA1 of host names when
determining the connectivity. Internally uses an MPI_Gather of the
digests and a MPI_Bcast of the unique host indices.
NOTE:
does not use MPI_Comm_split or MPI_Comm_split_type since these
return MPI_COMM_NULL on non-participating processes, which does not
easily fit into the OpenFOAM framework.
Additionally, if using the caching version of
UPstream::commInterHost() and UPstream::commIntraHost(),
the topology is determined simultaneously
(ie, equivalent or potentially lower communication).
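  A rough sketch of that exchange in plain MPI (illustrative only, not
  the actual OpenFOAM code; assumes an initialised MPI program with the
  usual rank/nProcs/comm values):

      // Gather a fixed-size digest of each host name on the master,
      // number the unique digests there, then broadcast the
      // per-rank host indices back to everyone.
      char digest[20];   // eg, SHA1 of the host name
      std::vector<char> all(rank == 0 ? 20*nProcs : 0);
      MPI_Gather(digest, 20, MPI_BYTE, all.data(), 20, MPI_BYTE, 0, comm);

      std::vector<int> hostIndex(nProcs);
      if (rank == 0) { /* assign 0,1,2,... per unique digest */ }
      MPI_Bcast(hostIndex.data(), nProcs, MPI_INT, 0, comm);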
- make sizing of commsStruct List demand-driven as well
for more robustness, fewer unneeded allocations.
- fix potential latent bug with allBelow/allNotBelow for proc 0
(linear communication).
ENH: remove unused/unusable UPstream::communicator optional parameter
- had constructor option to avoid constructing the MPI backend,
but this is not useful and inconsistent with what the reset or
destructor expect.
STYLE: local use of UPstream::communicator
- automatically frees communicator when it leaves scope
- these are primarily used when encountering sparse (eg, inter-host)
communicators. Additional UPstream convenience methods:
is_rank(comm)
=> True if the process corresponds to a rank in the communicator.
Can be a master rank or a sub-rank.
is_parallel(comm)
=> True if a parallel algorithm or exchange is used on the process.
Same as:
(parRun() && (nProcs(comm) > 1) && is_rank(comm))
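  A hypothetical usage sketch (constructor arguments assumed):

      {
          // eg, a sparse communicator over a subset of ranks
          UPstream::communicator subComm(UPstream::worldComm, subRanks);

          if (UPstream::is_parallel(subComm))
          {
              ...  // parallel exchanges on the participating ranks
          }
          else if (UPstream::is_rank(subComm))
          {
              ...  // in the communicator, but no parallel exchange
          }
      }
      // <- communicator freed automatically on leaving scope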
- previously had an additional stack of freedRequests_, which was
used to 'remember' locations in the list of outstandingRequests_
that had been handled by 'waitRequest()'.
This was principally done for sanity checks on shutdown,
but we now just test for any outstanding requests that
are *not* MPI_REQUEST_NULL instead (much simpler).
The framework with freedRequests_ also had a provision to 'recycle'
them by popping from that stack, but this is rather fragile since it
would only be triggered by some collectives
(MPI_Iallreduce, MPI_Ialltoall, MPI_Igather, MPI_Iscatter)
with no guarantee that these will all be properly removed again.
There was also no pruning of extraneous indices.
ENH: consolidate internal reset/push of requests
- replace duplicate code with inline functions
reset_request(), push_request()
ENH: null out trailing requests
- extra safety (paranoia) for the UPstream::Request versions
of finishedRequests(), waitAnyRequest()
CONFIG: document nPollProcInterfaces in etc/controlDict
- still experimental, but at least make the keyword known
- mechanism has been unused for at least a decade or more
(or was never used). Message tags are assigned on an ad hoc basis
locally when collision avoidance is necessary.
- separate broadcast times from reduce/gather/scatter time
- separate wait times from all-to-all time
- support invocation counts, split off requests time/count
from others to avoid flooding the counts
- support 'detail' switch to increase the output information.
Format may change in the future
- attempted reduction in bookkeeping (commit: 068ab8ccc7) meant that
the worldComm didn't have a group from which sub-communicators could
be spun off.
- do not force reset of PstreamBuffers positions
STYLE: UPstream::globalComm instead of '0'
- permits distinction between communicators/groups that were
user-created (eg, MPI_Comm_create) versus those queried from MPI.
Previously simply relied on non-null values, but that is too fragile
ENH: support List<Request> version of UPstream::finishedRequests
- allows more independent algorithms
ENH: added UPstream::probeMessage(...). Blocking or non-blocking
- UPstream::Request wrapping class provides an opaque wrapper for
vendor MPI_Request values, independent of global lists.
ENH: support for MPI barrier (blocking or non-blocking)
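  For example, a sketch assuming the request-pointer form:

      // Blocking:
      UPstream::barrier(comm);

      // Non-blocking: capture into a request, wait later
      UPstream::Request req;
      UPstream::barrier(comm, &req);
      ...
      UPstream::waitRequest(req);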
ENH: support for MPI sync-send variants
STYLE: deprecate waitRequests() without a position parameter
- in many cases this can indicate a problem in the program logic since
normally the startOfRequests should be tracked locally.
- now simply a no-op for out-of-range values (instead of an error),
which simplifies the calling code.
Previously
==========

    if (request_ >= 0 && request_ < UPstream::nRequests())
    {
        UPstream::waitRequest(request_);
    }

Updated
=======

    UPstream::waitRequest(request_);
- when 'recycling' freed request indices, ensure they are actually
within the currently addressable range
- MPI finalization now checks outstanding requests against
MPI_REQUEST_NULL to verify that they have been waited or tested on.
Previously simply checked against freed request indices
ENH: consistent initialisation of send/receive bookkeeping
- UPstream::globalComm constant always refers to MPI_COMM_WORLD but
UPstream::worldComm could be MPI_COMM_WORLD (single world)
or a dedicated local communicator (for multi-world).
- provide a Pstream wrapped version of MPI_COMM_SELF,
referenced as UPstream::selfComm
- UPstream::isUserComm(label)
test for additional user-defined communicators
- simplifies coding
* finishedRequest(), waitRequest(), waitRequests() with parRun guards
* nRequests() is noexcept
- more consistent use of UPstream::defaultCommsType in branching
- additional Pstream::broadcasts() method to serialize/deserialize
multiple items.
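  For example (items illustrative):

      // Serialise/deserialise several items in one call
      label nItems = ...;
      wordList names = ...;
      Pstream::broadcasts(UPstream::worldComm, nItems, names);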
- revoke the broadcast specialisations for std::string and List(s) and
use a generic broadcasting template. In most cases, the previous
specialisations would have required two broadcasts:
(1) for the size
(2) for the contiguous content.
Now favour reduced communication over potential local (intermediate)
storage that would have only benefited a few select cases.
ENH: refine PstreamBuffers access methods
- replace 'bool hasRecvData(label)' with 'label recvDataCount(label)'
to recover the number of unconsumed receive bytes from the specified
processor. Can use 'labelList recvDataCounts()' to recover the
number of unconsumed receive bytes from all processors.
- additional peekRecvData() method (for transcribing contiguous data)
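  A usage sketch (loop bounds illustrative):

      // After pBufs.finishedSends():
      for (label proci = 0; proci < UPstream::nProcs(); ++proci)
      {
          const label nBytes = pBufs.recvDataCount(proci);

          if (nBytes)
          {
              ...  // eg, peekRecvData(proci) to transcribe contiguous data
          }
      }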
ENH: globalIndex whichProcID - check for isLocal first
- reasonable to assume that local items are searched for more
frequently, so do a preliminary check for isLocal before performing
a more costly binary search of the globalIndex offsets
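  A sketch of the lookup order (illustrative, not the exact source):

      label whichProcID(const label i) const
      {
          if (isLocal(i))
          {
              return Pstream::myProcNo();   // cheap range check first
          }
          return findLower(offsets_, i+1);  // binary search of offsets
      }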
ENH: masterUncollatedFileOperation - bundled scatter of status
- less communication than gatherList/scatterList
ENH: refine send granularity in Pstream::exchange
STYLE: ensure PstreamBuffers and defaultCommsType agree
- simpler loops for lduSchedule
- native MPI min/max/sum reductions for float/double
irrespective of WM_PRECISION_OPTION
- native MPI min/max/sum reductions for (u)int32_t/(u)int64_t types,
irrespective of WM_LABEL_SIZE
- replace rarely used vector2D sum reduction with FixedList as an
indicator of its intent; this also generalizes to different lengths.
OLD:
    vector2D values; values.x() = ...; values.y() = ...;
    reduce(values, sumOp<vector2D>());
NEW:
    FixedList<scalar,2> values; values[0] = ...; values[1] = ...;
    reduce(values, sumOp<scalar>());
- allow returnReduce() to use native reductions. Previous code (with
linear/tree selector) would have bypassed them inadvertently.
ENH: added support for MPI broadcast (for a memory span)
ENH: select communication schedule as a static method
- UPstream::whichCommunication(comm) to select linear/tree
communication instead of ternary or
if (Pstream::nProcs() < Pstream::nProcsSimpleSum) ...
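  For example:

      const auto& comms = UPstream::whichCommunication(comm);

      // instead of the previous ternary selection:
      //   (UPstream::nProcs(comm) < UPstream::nProcsSimpleSum)
      //       ? UPstream::linearCommunication(comm)
      //       : UPstream::treeCommunication(comm)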
STYLE: align nProcsSimpleSum static value with etc/controlDict override
- partial revert for 13740de427 (#2158)
MS-MPI does not currently have an MPI_Comm_create_group(),
so keep using MPI_Comm_create() there.
Only affects multi-world simulations.
CONFIG: retain dummy version of libPstream.dll
- retain as libPstream.dll-dummy so that it is available for
manual replacement of the regular libPstream.dll (#2290)
Keep extra copy of libPstream.dll as libPstream.dll-msmpi
(for example) for manual replacement.
- UPstream::mpiGather (MPI_Gather) - used by Pstream::listGatherValues
- UPstream::mpiScatter (MPI_Scatter) - used by Pstream::listScatterValues
These are much simpler forms for gather/scatter of fixed-size
contiguous data types (eg, primitives, simple pairs etc).
In the gather form, creates a list of gathered values on the master
process. The subranks have a list size of zero.
Similarly, scatter will distribute a list of values to single values
on each process.
Instead of

    labelList sendSizes(Pstream::nProcs());
    sendSizes[Pstream::myProcNo()] = sendData.size();
    Pstream::gatherList(sendSizes);

Can write

    const labelList sendSizes
    (
        UPstream::listGatherValues<label>(sendData.size())
    );

    // Less code, lower overhead and list can be const.
For scattering an individual value only,
instead of

    labelList someValues;
    if (Pstream::master()) someValues = ...;
    Pstream::scatterList(someValues);

    const label localValue
    (
        someValues[Pstream::myProcNo()]
    );

Can write

    labelList someValues;
    if (Pstream::master()) someValues = ...;

    const label localValue
    (
        UPstream::listScatterValues<label>(someValues)
    );
Can of course also mix listGatherValues to assemble a list on master
and use Pstream::scatterList to distribute.
ENH: adjusted globalIndex gather methods
- added mpiGather() method [contiguous data only] using MPI_Gatherv
- respect localSize if gathering master data to ensure that a
request for 0 master elements is properly handled.
- for use when the is_contiguous check has already been done outside
the loop. Naming as per std::span.
STYLE: use data/cdata instead of begin
ENH: replace random_shuffle with shuffle, fix OSX int64 ambiguity
- previously used a Pstream::exit() invoked from the argList
destructor to handle all MPI shutdown, but this has the unfortunate
side-effect of using a fixed return value for the program exit.
Instead use the Pstream::shutdown() method in the destructor and allow
the normal program exit codes as usual. This means that the
following code now works as expected.
```
argList args(...);

if (...)
{
    InfoErr<< "some error\n";
    return 1;
}
```