Commit Graph

5 Commits

Author SHA1 Message Date
Mark Olesen
18e0d7e4d6 ENH: bundle broadcasts (#2371)
- additional Pstream::broadcasts() method to serialize/deserialize
  multiple items.

- revoke the broadcast specialisations for std::string and List(s) and
  use a generic broadcasting template. In most cases, the previous
  specialisations would have required two broadcasts:
    (1) for the size
    (2) for the contiguous content.

  Now favour reduced communication over potential local (intermediate)
  storage that would have only benefited a few select cases.
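
The trade-off above can be sketched in standalone C++ (no OpenFOAM types). The `Buffer`, `pack` and `unpack` names are hypothetical, and the raw length-prefix encoding is an assumption for illustration only, not the format that Pstream::broadcasts() actually uses. Several items are serialized into one buffer, so a single transfer can replace a size-plus-content pair per item:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical byte buffer standing in for a broadcast stream
using Buffer = std::vector<char>;

// Append one string as [length][content] to the shared buffer
void pack(Buffer& buf, const std::string& s)
{
    const std::uint64_t n = s.size();
    const char* p = reinterpret_cast<const char*>(&n);
    buf.insert(buf.end(), p, p + sizeof(n));
    buf.insert(buf.end(), s.begin(), s.end());
}

// Read one string starting at 'pos'; returns the position after it
std::size_t unpack(const Buffer& buf, std::size_t pos, std::string& s)
{
    std::uint64_t n = 0;
    assert(pos + sizeof(n) <= buf.size());
    std::memcpy(&n, buf.data() + pos, sizeof(n));
    pos += sizeof(n);

    assert(pos + n <= buf.size());
    s.assign(buf.data() + pos, static_cast<std::size_t>(n));
    return pos + static_cast<std::size_t>(n);
}
```

The sender would call pack() once per item and transmit the buffer in one operation (transport not shown). Each receiver then calls unpack() in the same order.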

ENH: refine PstreamBuffers access methods

- replace 'bool hasRecvData(label)' with 'label recvDataCount(label)'
  to recover the number of unconsumed receive bytes from the specified
  processor.  Can use 'labelList recvDataCounts()' to recover the
  number of unconsumed receive bytes from all processors.

- additional peekRecvData() method (for transcribing contiguous data)

ENH: globalIndex whichProcID - check for isLocal first

- reasonable to assume that local items are searched for more
  frequently, so do preliminary check for isLocal before performing
  a more costly binary search of globalIndex offsets
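
A minimal standalone sketch of this lookup order, assuming an offsets table in which offsets[i] is the first global index owned by rank i and the last entry is the total size. `OffsetTable` and its members are hypothetical stand-ins, not the globalIndex implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a global-to-rank offsets table
struct OffsetTable
{
    std::vector<std::int64_t> offsets;  // size is (number of ranks + 1)
    int myRank;

    // Two comparisons against the local range
    bool isLocal(std::int64_t i) const
    {
        return i >= offsets[myRank] && i < offsets[myRank + 1];
    }

    int whichProc(std::int64_t i) const
    {
        assert(i >= 0 && i < offsets.back());

        // Cheap preliminary check: local items are assumed to be common
        if (isLocal(i))
        {
            return myRank;
        }

        // Binary search: first offset greater than i, owner is one before
        auto it = std::upper_bound(offsets.begin(), offsets.end(), i);
        return static_cast<int>(it - offsets.begin()) - 1;
    }
};
```

The result is the same with or without the preliminary check. Only the cost for local lookups changes.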

ENH: masterUncollatedFileOperation - bundled scatter of status
2022-04-29 11:44:28 +02:00
Mark Olesen
0cf02eb384 ENH: globalIndex with direct gather/broadcast
- less communication than gatherList/scatterList

ENH: refine send granularity in Pstream::exchange

STYLE: ensure PstreamBuffers and defaultCommsType agree

- simpler loops for lduSchedule
2022-03-12 21:16:29 +01:00
Mark Olesen
341d9c402d BUG: incorrect chunk handling in Pstream::exchange (fixes #2375)
- used Pstream::maxCommsSize (bytes) for the lower limit when sending.
  This would have sent more data on each iteration than expected based
  on maxCommsSize and finished with a number of useless iterations.

  Was generally not a serious bug since maxCommsSize (if used) was
  likely still far away from the MPI limits and exchange() is primarily
  harnessed by PstreamBuffers, which sends character data
  (i.e., the number of elements and the number of bytes are identical).
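
The distinction between a byte limit and an element count can be sketched as follows. The `chunkRanges` function and its parameters are hypothetical and do not reproduce the Pstream::exchange code. The byte limit is converted to an element count before the ranges are built, so each send stays within the limit and no empty iterations remain:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Split nElems elements (each elemBytes wide) into [begin, end) element
// ranges of at most maxBytes each. A range always holds at least one
// element, even if a single element is larger than maxBytes.
std::vector<std::pair<std::size_t, std::size_t>>
chunkRanges(std::size_t nElems, std::size_t elemBytes, std::size_t maxBytes)
{
    assert(elemBytes > 0 && maxBytes > 0);

    // Convert the byte limit into an element count (the step that the
    // bug description says was missing for the send range)
    const std::size_t chunkElems =
        std::max<std::size_t>(1, maxBytes / elemBytes);

    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t begin = 0; begin < nElems; begin += chunkElems)
    {
        ranges.emplace_back(begin, std::min(begin + chunkElems, nElems));
    }
    return ranges;
}
```

With elemBytes equal to 1 (character data) the byte limit and the element limit coincide, which matches the remark that the bug was rarely visible.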
2022-03-04 17:49:23 +00:00
Mark Olesen
bfca84d11a ENH: implement OPstream rewind() to support reuse of output streams
- a somewhat specialized use case, but can be useful when there are
  many ranks with sparse communication but for which the access
  pattern is established during inner loops.

      PstreamBuffers pBufs(Pstream::commsTypes::nonBlocking);
      pBufs.allowClearRecv(false);

      PtrList<OPstream> output(Pstream::nProcs());

      while (condition)
      {
          // Rewind existing streams
          forAll(output, proci)
          {
              auto* osptr = output.get(proci);
              if (osptr)
              {
                  (*osptr).rewind();
              }
          }

          for (Particle& p : myCloud)
          {
              label toProci = ...;

              // Get or create output stream
              auto* osptr = output.get(toProci);
              if (!osptr)
              {
                  osptr = new OPstream(toProci, pBufs);
                  output.set(toProci, osptr);
              }

              // Append more data...
              (*osptr) << p;
          }

          pBufs.finishedSends();

          // ... reads
      }
2022-03-04 17:49:23 +00:00
Mark Olesen
c086f22298 ENH: extend/improve broadcast handling
- split off a Pstream::genericBroadcast() which uses UOPBstream during
  serialization and UIPBstream during de-serialization.
  This function will not normally be used directly by callers, but
  provides a base layer for higher-level broadcast calls.

- low-level UPstream broadcast of string content.
  Since std::string has length and contiguous content, it is possible
  to handle directly by the following:
     1. broadcast size
     2. resize
     3. broadcast content when size != 0

  Although this involves a similar amount of communication to the
  generic streaming version (min 1, max 2 broadcasts), it is more
  efficient because it avoids serialization/de-serialization overhead.

- handle broadcast of List content distinctly.
  Allows an optimized path for contiguous data, similar to how
  std::string is handled (broadcast size, resize container, broadcast
  content when size != 0), but can revert to genericBroadcast (streamed)
  for non-contiguous data.

- make various scatter variants simple aliases for broadcast, since
  that is what they are doing behind the scenes anyhow:

    * scatter()
    * combineScatter()
    * listCombineScatter()
    * mapCombineScatter()

  Except scatterList() which remains somewhat different.
  Beyond the additional (size == nProcs) check, the only difference to
  using broadcast(List<T>&) or a regular scatter(List<T>&) is that
  processor-local data is skipped. So leave this variant as-is.

STYLE: rename/prefix implementation code with 'Pstream'

- better association with its purpose and provides a unique name
2022-03-04 17:49:23 +00:00