- additional Pstream::broadcasts() method to serialize/deserialize
multiple items.
- revoke the broadcast specialisations for std::string and List(s) and
use a generic broadcasting template. In most cases, the previous
specialisations would have required two broadcasts:
(1) for the size
(2) for the contiguous content.
Now favour reduced communication over potential local (intermediate)
storage that would have only benefited a few select cases.
ENH: refine PstreamBuffers access methods
- replace 'bool hasRecvData(label)' with 'label recvDataCount(label)'
to recover the number of unconsumed receive bytes from specified
processor. Can use 'labelList recvDataCounts()' to recover the
number of unconsumed receive bytes from all processor.
- additional peekRecvData() method (for transcribing contiguous data)
ENH: globalIndex whichProcID - check for isLocal first
- reasonable to assume that local items are searched for more
frequently, so do preliminary check for isLocal before performing
a more costly binary search of globalIndex offsets
ENH: masterUncollatedFileOperation - bundled scatter of status
- less communication than gatherList/scatterList
ENH: refine send granularity in Pstream::exchange
STYLE: ensure PstreamBuffers and defaultCommsType agree
- simpler loops for lduSchedule
- used Pstream::maxCommsSize (bytes) for the lower limit when sending.
This would have send more data on each iteration than expected based
on maxCommsSize and finish with a number of useless iterations.
Was generally not a serious bug since maxCommsSize (if used) was
likely still far away from the MPI limits and exchange() is primarily
harnessed by PstreamBuffers, which is sending character data
(ie, number of elements and number of bytes is identical).
- a somewhat specialized use case, but can be useful when there are
many ranks with sparse communication but for which the access
pattern is established during inner loops.
PstreamBuffers pBufs(Pstream::commsTypes::nonBlocking);
pBufs.allowClearRecv(false);
PtrList<OPstream> output(Pstream::nProcs());
while (condition)
{
// Rewind existing streams
forAll(output, proci)
{
auto* osptr = output.get(proci);
if (osptr)
{
(*osptr).rewind();
}
}
for (Particle& p : myCloud)
{
label toProci = ...;
// Get or create output stream
auto* osptr = output.get(toProci);
if (!osptr)
{
osptr = new OPstream(toProci, pBufs);
output.set(toProci, osptr);
}
// Append more data...
(*osptr) << p;
}
pBufs.finishedSends();
... reads
}
- split off a Pstream::genericBroadcast() which uses UOPBstream during
serialization and UOPBstream during de-serialization.
This function will not normally be used directly by callers, but
provides a base layer for higher-level broadcast calls.
- low-level UPstream broadcast of string content.
Since std::string has length and contiguous content, it is possible
to handle directly by the following:
1. broadcast size
2. resize
3. broadcast content when size != 0
Although this is a similar amount of communication as the generic
streaming version (min 1, max 2 broadcasts) it is more efficient
by avoiding serialization/de-serialization overhead.
- handle broadcast of List content distinctly.
Allows an optimized path for contiguous data, similar to how
std::string is handled (broadcast size, resize container, broadcast
content when size != 0), but can revert to genericBroadcast (streamed)
for non-contiguous data.
- make various scatter variants simple aliases for broadcast, since
that is what they are doing behind the scenes anyhow:
* scatter()
* combineScatter()
* listCombineScatter()
* mapCombineScatter()
Except scatterList() which remains somewhat different.
Beyond the additional (size == nProcs) check, the only difference to
using broadcast(List<T>&) or a regular scatter(List<T>&) is that
processor-local data is skipped. So leave this variant as-is.
STYLE: rename/prefix implementation code with 'Pstream'
- better association with its purpose and provides a unique name