- "buffered" corresponds to MPI_Bsend (buffered send),
whereas the old name "blocking" is misleading since the
regular MPI_Send also blocks until completion
(ie, buffer can be reused).
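  For contrast, a plain-MPI sketch of the two send modes (outside of
  OpenFOAM; the buffer sizing follows the MPI standard):

      #include <mpi.h>
      #include <vector>

      void demoSends(int dest, MPI_Comm comm)
      {
          double data[8] = {};

          // standard ("blocking") send: returns once 'data' may be
          // reused, which does not necessarily mean it has arrived
          MPI_Send(data, 8, MPI_DOUBLE, dest, 0, comm);

          // buffered send: copies into a user-attached buffer and
          // returns immediately
          int size = 0;
          MPI_Pack_size(8, MPI_DOUBLE, comm, &size);
          size += MPI_BSEND_OVERHEAD;
          std::vector<char> buf(size);
          MPI_Buffer_attach(buf.data(), size);
          MPI_Bsend(data, 8, MPI_DOUBLE, dest, 0, comm);

          void* addr = nullptr;
          MPI_Buffer_detach(&addr, &size);   // blocks until delivered
      }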
ENH: IPstream::read() returns std::streamsize instead of label (#3152)
- previously returned a 'label' but std::streamsize is consistent with
the input parameter and will help with later adjustments.
- use <label> instead of <int> for internal accounting of the message
size, for consistency with the underlying List<char> buffers used.
- improve handling for corner case of IPstream receive with
non-blocking, although this combination is not used anywhere
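  As an interface illustration (a hypothetical class, not the
  verified OpenFOAM declaration), the return type now mirrors the
  count parameter, as with std::istream:

      #include <ios>  // std::streamsize

      struct Reader
      {
          // previously returned 'label'; now consistent with 'count'
          std::streamsize read(char* buf, std::streamsize count);
      };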
ENH: eliminate unnecessary duplicate communicator
- globalMeshData previously had a comm_dup hack to avoid clashes
  with the deltaCoeffs calculations. However, this was largely due
  to a manual implementation of reduce() that used point-to-point
  communication. It has since been updated to use an MPI_Allreduce
  and now an MPI_Allgather, neither of which needs this hack
  (see the sketch after this entry).
- can use UList signature since the routines do not resize the list
or attempt to broadcast it: useful for SubList handling.
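  For context, the manual point-to-point reduce collapses into a
  single collective (plain-MPI sketch, not the actual Pstream code):

      #include <mpi.h>

      // Sum a local value across all ranks; every rank receives the
      // result. One collective replaces the gather-to-master,
      // compute, broadcast-back chain that motivated comm_dup.
      double globalSum(double localValue, MPI_Comm comm)
      {
          double result = 0;
          MPI_Allreduce(&localValue, &result, 1, MPI_DOUBLE, MPI_SUM, comm);
          return result;
      }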
ENH: add IPstream/OPstream send/recv static methods
- in most cases one can simply construct a mapDistribute with the
  sendMap and have it take care of the communication and addressing
  for the corresponding constructMap.
  This removes code duplication and, in several cases, replaces much
  less efficient mechanisms (eg, a combineReduce on a list of lists,
  or an allGatherList on the send sizes). It also reduces the number
  of places where Pstream::exchange/exchangeSizes is called.
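  A rough shape of that usage (OpenFOAM-style sketch; the exact
  constructor signature is assumed from the description above, not
  verified against the sources):

      // sendMap[proci]: local element ids destined for processor proci
      labelListList sendMap(Pstream::nProcs());
      // ... fill sendMap ...

      // assumed: mapDistribute derives the constructMap and performs
      // the size exchanges internally
      mapDistribute map(std::move(sendMap));

      // distribute a field: 'values' is resized and filled with the
      // elements received from the other processors
      map.distribute(values);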
ENH: reduce communication in turbulentDFSEMInlet
- was doing an allGatherList to populate a mapDistribute.
  Now simply uses the PstreamBuffers mechanisms directly
  (see the sketch below).
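  The canonical PstreamBuffers pattern looks roughly like this
  (a sketch assuming an OpenFOAM build; sendProcs, recvProcs and the
  data containers are illustrative names):

      PstreamBuffers pBufs(Pstream::commsTypes::nonBlocking);

      // send phase: stream data to each target processor
      for (const int proci : sendProcs)
      {
          UOPstream toProc(proci, pBufs);
          toProc << localData[proci];
      }

      pBufs.finishedSends();   // exchange sizes, complete sends

      // receive phase
      for (const int proci : recvProcs)
      {
          UIPstream fromProc(proci, pBufs);
          fromProc >> remoteData[proci];
      }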
- returns a range of `int` values that can be iterated across.
  For example,

      for (const int proci : Pstream::subProcs()) { ... }

  instead of

      for
      (
          int proci = Pstream::firstSlave();
          proci <= Pstream::lastSlave();
          ++proci
      )
      {
          ...
      }
- makes the intent clearer and avoids the need for additional
  constructor casting. Eg,

      labelList(10, Zero)     vs.  labelList(10, 0)
      scalarField(10, Zero)   vs.  scalarField(10, scalar(0))
      vectorField(10, Zero)   vs.  vectorField(10, vector::zero)
This class is largely a pre-C++11 holdover. It is now possible to
simply use move construct/assignment directly.
In a few rare cases (eg, polyMesh::resetPrimitives) it has been
replaced by an autoPtr.
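  For illustration, plain C++11 move operations cover what the
  transfer class used to provide (a sketch assuming an OpenFOAM
  environment for labelList and identity):

      labelList a(identity(10));    // 0,1,2,...,9

      // move construct: b steals a's storage, no deep copy;
      // a is left empty
      labelList b(std::move(a));

      // move assignment from a temporary
      b = labelList(5, Zero);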
- only affects transfer of C-style string with a single character
remaining after whitespace stripping. Test added into Test-parallel.
- Note some idiosyncrasies in the behaviour:

      send                     | receives
      -------------------------+-------------------------
      string("a b c")          | string "a b c"
      string("a")              | string "a"
      "a b c"                  | word "abc"
      'd'                      | char 'd'
      "d"                      | char 'd'
      "d "                     | char 'd'
Contributed by Mattijs Janssens.
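  A rough shape of the exercised corner case (a hypothetical snippet
  modelled on the table above; the constructor arguments follow the
  usual OPstream/IPstream pattern):

      if (Pstream::master())
      {
          OPstream toProc(Pstream::commsTypes::blocking, 1);
          toProc << "d ";   // C-style string, one char after stripping
      }
      else if (Pstream::myProcNo() == 1)
      {
          IPstream fromMaster
          (
              Pstream::commsTypes::blocking,
              Pstream::masterNo()
          );
          char c;
          fromMaster >> c;  // received as char 'd', not as a word
      }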
1. Any non-blocking data exchange needs to know in advance the sizes
   to receive, so that it can size the receive buffers. For "halo"
   exchanges this is not a problem since the sizes are known in
   advance, but for all other data exchanges these sizes need to be
   communicated first.
   This was previously done by having all processors send their
   send-sizes to the master, which combined and returned the
   information, so that all processors
   - had the same information
   - could work out who was sending what to where, and hence what
     needed to be received.
   This is now changed such that the size is only sent to the
   destination processor (instead of to all, as previously). This
   means that
   - the list of sizes to send is now of size nProcs, vs.
     nProcs*nProcs before
   - the round trip through the master is cut out by using a native
     MPI call (see the sketch below)
   This causes a small change to the API of exchange and
   PstreamBuffers: they now return only the sizes of the local
   buffers (a labelList), not the sizes of the buffers on all
   processors (a labelListList).
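   A plain-MPI sketch of such a size exchange; MPI_Alltoall is one
   natural fit, although the actual native call used by Pstream is
   an assumption here:

       #include <mpi.h>
       #include <vector>

       // sendSizes[proci]: bytes this rank sends to proci
       // recvSizes[proci]: bytes this rank will receive from proci
       //                   (pre-sized to nProcs by the caller)
       void exchangeSizes
       (
           const std::vector<int>& sendSizes,
           std::vector<int>& recvSizes,
           MPI_Comm comm
       )
       {
           MPI_Alltoall
           (
               sendSizes.data(), 1, MPI_INT,
               recvSizes.data(), 1, MPI_INT,
               comm
           );
       }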
2. Reversing the order in which sending is done when scattering
   information from the master processor to the other processors.
   This is done in a tree-like fashion. Each processor has a set of
   processors to receive from/send to. When receiving, it first
   receives from the processors with the fewest sub-processors
   (i.e. the ones that return first). When sending it needs to do
   the opposite: start with the processor that has the largest
   sub-tree, since this is the critical path.
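   A minimal sketch of the ordering logic (names hypothetical):

       #include <algorithm>
       #include <vector>

       struct Child { int rank; int subTreeSize; };

       void orderComms(std::vector<Child>& children)
       {
           // sort children by subtree size, smallest first
           std::sort
           (
               children.begin(), children.end(),
               [](const Child& a, const Child& b)
               { return a.subTreeSize < b.subTreeSize; }
           );

           // gather: receive from smallest subtrees first, since
           // they finish (and return) first
           for (const Child& c : children) { /* receive from c.rank */ }

           // scatter: send in reverse order, largest subtree first,
           // to start the critical path as early as possible
           for (auto it = children.rbegin(); it != children.rend(); ++it)
           {
               /* send to it->rank */
           }
       }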
- change system/controlDict to use functions {..} instead of
  functions (..); (illustrated after this list)
  * This is internally more efficient
- fixed formatting of the system/controlDict functions entry
- pedantic change: use 'return 0' instead of 'return(0)' in the applications,
since return is a C/C++ keyword, not a function.
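  For reference, the two controlDict forms (the entry names here are
  illustrative only):

      // old list form:
      functions
      (
          probes { type probes; /* ... */ }
      );

      // new dictionary form:
      functions
      {
          probes
          {
              type    probes;
              // ...
          }
      }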