- bundles frequently used 'gather/scatter' patterns more consistently.
- combineAllGather -> combineGather + broadcast
- listCombineAllGather -> listCombineGather + broadcast
- mapCombineAllGather -> mapCombineGather + broadcast
- allGatherList -> gatherList + scatterList
- reduce -> gather + broadcast (ie, allreduce)
- The allGatherList currently wraps gatherList/scatterList, but may be
replaced with a different algorithm in the future.
STYLE: PstreamCombineReduceOps.H is mostly unneeded now
- gather/scatter types of operations can avoid AllToAll communication
and use simple MPI gather (or scatter) to establish the receive sizes.
New methods: finishedGathers() / finishedScatters()
- do not need contruct or move assign from SortableList.
Rarely (never) used and can simply treat like a normal list
by applying shrink beforehand.
- make append() methods return void instead of returning self, which
makes it easier to derive from. Having them return self was a bit of
an original design mistake.
Chaining appends do not actually occur anywhere. Even if they were
to be used, would not want to rely on them (fear of slicing on any
derived classes).
BUG: IndirectList iterator comparison loses constness
- eliminate redundant size_ accounting
- drop extra 'Container' template parameter and replace functionality
with more flexible pack/unpack methods.
There is also a pack() method that handles indirect lists of lists
that can be used, for example, to pack a patch slice of faces.
Drop the 'operator()' method in favour of unpack to expose and properly
document the conversion. Should revisit the corresponding code in
some places for optimization potential.
- align some method names with globalIndex:
totalSize(), maxSize() etc
- less communication than gatherList/scatterList
ENH: refine send granularity in Pstream::exchange
STYLE: ensure PstreamBuffers and defaultCommsType agree
- simpler loops for lduSchedule
- the internal data are contiguous so can broadcast size and internals
directly without an intermediate stream.
ENH: split out broadcast time for profilingPstream information
STYLE: minor Pstream cleanup
- UPstream::commsType_ from protected to private, since it already has
inlined noexcept getters/setters that should be used.
- don't pass unused/unneed tag into low-level MPI reduction templates.
Document where tags are not needed
- had Pstream::broadcast instead of UPstream::broadcast in internals
- used Pstream::maxCommsSize (bytes) for the lower limit when sending.
This would have send more data on each iteration than expected based
on maxCommsSize and finish with a number of useless iterations.
Was generally not a serious bug since maxCommsSize (if used) was
likely still far away from the MPI limits and exchange() is primarily
harnessed by PstreamBuffers, which is sending character data
(ie, number of elements and number of bytes is identical).
- PstreamBuffers nProcs() and allProcs() methods to recover the rank
information consistent with the communicator used for construction
- allowClearRecv() methods for more control over buffer reuse
For example,
pBufs.allowClearRecv(false);
forAll(particles, particlei)
{
pBufs.clear();
fill...
read via IPstream(..., pBufs);
}
This preserves the receive buffers memory allocation between calls.
- finishedNeighbourSends() method as compact wrapper for
finishedSends() when send/recv ranks are identically
(eg, neighbours)
- hasSendData()/hasRecvData() methods for PstreamBuffers.
Can be useful for some situations to skip reading entirely.
For example,
pBufs.finishedNeighbourSends(neighProcs);
if (!returnReduce(pBufs.hasRecvData(), orOp<bool>()))
{
// Nothing to do
continue;
}
...
On an individual basis:
for (const int proci : pBufs.allProcs())
{
if (pBufs.hasRecvData(proci))
{
...
}
}
Also conceivable to do the following instead (nonBlocking only):
if (!returnReduce(pBufs.hasSendData(), orOp<bool>()))
{
// Nothing to do
pBufs.clear();
continue;
}
pBufs.finishedNeighbourSends(neighProcs);
...
- split off a Pstream::genericBroadcast() which uses UOPBstream during
serialization and UOPBstream during de-serialization.
This function will not normally be used directly by callers, but
provides a base layer for higher-level broadcast calls.
- low-level UPstream broadcast of string content.
Since std::string has length and contiguous content, it is possible
to handle directly by the following:
1. broadcast size
2. resize
3. broadcast content when size != 0
Although this is a similar amount of communication as the generic
streaming version (min 1, max 2 broadcasts) it is more efficient
by avoiding serialization/de-serialization overhead.
- handle broadcast of List content distinctly.
Allows an optimized path for contiguous data, similar to how
std::string is handled (broadcast size, resize container, broadcast
content when size != 0), but can revert to genericBroadcast (streamed)
for non-contiguous data.
- make various scatter variants simple aliases for broadcast, since
that is what they are doing behind the scenes anyhow:
* scatter()
* combineScatter()
* listCombineScatter()
* mapCombineScatter()
Except scatterList() which remains somewhat different.
Beyond the additional (size == nProcs) check, the only difference to
using broadcast(List<T>&) or a regular scatter(List<T>&) is that
processor-local data is skipped. So leave this variant as-is.
STYLE: rename/prefix implementation code with 'Pstream'
- better association with its purpose and provides a unique name
- The idea of broadcast streams is to replace multiple master to
subProcs communications with a single MPI_Bcast.
if (Pstream::master())
{
OPBstream toAll(Pstream::masterNo());
toAll << data;
}
else
{
IPBstream fromMaster(Pstream::masterNo());
fromMaster >> data;
}
// vs.
if (Pstream::master())
{
for (const int proci : Pstream::subProcs())
{
OPstream os(Pstream::commsTypes::scheduled, proci);
os << data;
}
}
else
{
IPstream is(Pstream::commsTypes::scheduled, Pstream::masterNo());
is >> data;
}
Can simply use UPstream::broadcast() directly for contiguous data
with known lengths.
Based on ideas from T.Aoyagi(RIST), A.Azami(RIST)
- native MPI min/max/sum reductions for float/double
irrespective of WM_PRECISION_OPTION
- native MPI min/max/sum reductions for (u)int32_t/(u)int64_t types,
irrespective of WM_LABEL_SIZE
- replace rarely used vector2D sum reduction with FixedList as a
indicator of its intent and also generalizes to different lengths.
OLD:
vector2D values; values.x() = ...; values.y() = ...;
reduce(values, sumOp<vector2D>());
NEW:
FixedList<scalar,2> values; values[0] = ...; values[1] = ...;
reduce(values, sumOp<scalar>());
- allow returnReduce() to use native reductions. Previous code (with
linear/tree selector) would have bypassed them inadvertently.
ENH: added support for MPI broadcast (for a memory span)
ENH: select communication schedule as a static method
- UPstream::whichCommunication(comm) to select linear/tree
communication instead of ternary or
if (Pstream::nProcs() < Pstream::nProcsSimpleSum) ...
STYLE: align nProcsSimpleSum static value with etc/controlDict override
ENH: provide fieldTypes::surface names (as per fieldTypes::volume)
ENH: reduce number of files for surface fields
- combine face and point field declarations/definitions,
simplify typeName definitions
- used low-level MPI gather, but the wrapping routine contains an
additional safety check for is_contiguous which is not defined for
various std::pair<..> combination.
So std::pair<label,vector> (which is actually contiguous, but not
declared as is_contiguous) would falsely trip the check.
Avoid by simply gathering unbundled values instead.
- do not need STRINGIFY macros in ragel code
- remove wordPairHashTable.H and use equivalent wordPairHashes.H instead
STYLE: replace addDictOption with explicit option
- the usage text is otherwise misleading
GIT: combine Pair/Tuple2 directories
- unused in regular OpenFOAM code
- POSIX version uses deprecated gethostbyname()
- Windows version never worked
COMP: localize, noexcept on internal OSspecific methods
STYLE: support fileName::Type SYMLINK and LINK as synonyms
- for contiguous data, added mpiGatherOp() to complement the
gatherOp() static method
- the gather ops (static methods) populate the globalIndex on the
master only (not needed on other procs) for reduced communication
- rename inplace gather methods to include 'inplace' in their name.
Regular gather methods return the gathered data directly, which
allows the following:
const scalarField mergedWeights(globalFaces().gather(wghtSum));
vs.
scalarField mergedWeights;
globalFaces().gather(wghtSum, mergedWeights());
or even:
scalarField mergedWeights;
List<scalarField> allWeights(Pstream::nProcs());
allWeights[Pstream::myProcNo()] = wghtSum;
Pstream::gatherList(allWeights);
if (Pstream::master())
{
mergedWeights =
ListListOps::combine<scalarField>
(
allWeights, accessOp<scalarField>()
);
}
- add parRun guards on various globalIndex gather methods
(simple copies or no-ops in serial) to simplify the effort for callers.
ENH: reduce code effort for clearing linked-lists
ENH: adjust linked-list method name
- complement linked-list append() method with prepend() method
instead of 'insert', which is not very descriptive
- similar idea to swak timelines/lookuptables but combined together
and based on Function1 for more flexibility.
Specified as 'functions<scalar>' or 'functions<vector>'.
For example,
functions<scalar>
{
intakeType table ((0 0) (10 1.2));
p_inlet
{
type sine;
frequency 3000;
scale 50;
level 101325;
}
}
These can be referenced in the expressions as a nullary function or a
unary function.
Within the parser, the names are prefixed with "fn:" (function).
It is thus possible to define "fn:sin()" that is different than
the builtin "sin()" function.
* A nullary call uses time value
- Eg, fn:p_inlet()
* A unary call acts as a remapper function.
- Eg, fn:intakeType(6.25)
- previously simply reused the scan token, which works fine for
non-nested tokenizations but becomes too fragile with nesting.
Now changed to use tagged unions that can be copied about
and still retain some rudimentary knowledge of their types,
which can be manually triggered with a destroy() call.
- provide an 'identifier' non-terminal as an additional catch
to avoid potential leakage on parsing failure.
- adjust lemon rules and infrastructure:
- use %token to predefine standard tokens.
Will reduce some noise on the generated headers by retaining the
order on the initial token names.
- Define BIT_NOT, internal token rename NOT -> LNOT
- handle non-terminal vector values.
Support vector::x, vector::y and vector::z constants
- permit fieldExpr access to time().
Probably not usable or useful for an '#eval' expression,
but useful for a Function1.
- provisioning for hooks into function calls. Establishes token
names for next commit(s).
- use `#word` to concatenate, expand content with the resulting string
being treated as a word token. Can be used in dictionary or
primitive context.
In dictionary context, it fills the gap for constructing dictionary
names on-the-fly. For example,
```
#word "some_prefix_solverInfo_${application}"
{
type solverInfo;
libs (utilityFunctionObjects);
...
}
```
The '#word' directive will automatically squeeze out non-word
characters. In the block content form, it will also strip out
comments. This means that this type of content should also work:
```
#word {
some_prefix_solverInfo
/* Appended with application name (if defined) */
${application:+_} // Use '_' separator
${application} // The application
}
{
type solverInfo;
libs (utilityFunctionObjects);
...
}
```
This is admittedly quite ugly, but illustrates its capabilities.
- use `#message` to report expanded string content to stderr.
For example,
```
T
{
solver PBiCG;
preconditioner DILU;
tolerance 1e-10;
relTol 0;
#message "using solver: $solver"
}
```
Only reports on the master node.
- use FACE_DATA (was SURFACE_DATA) for similarity with polySurface
ENH: add expression value enumerations and traits
- simple enumeration of standard types (bool, label, scalar, vector)
that can be used as a value type-code for internal bookkeeping.
GIT: relocate pTraits into general traits/ directory
- releases ownership of the pointer. A no-op (and returns nullptr)
for references.
Naming consistent with unique_ptr and autoPtr.
DOC: adjust wording for memory-related classes
- add is_const() method for tmp, refPtr.
Drop (ununsed and confusing looking) isTmp method from refPtr
in favour of is_pointer() or movable() checks
ENH: noexcept for some pTraits methods, remove redundant 'inline'
- test for const first for tmp/refPtr (simpler logic)
- similar to -dry-run handling, can be interrogated from argList,
which makes it simpler to add into utilities.
- support multiple uses of -dry-run and -verbose to increase the
level. For example, could have
someApplication -verbose -verbose
and inside of the application:
if (args.verbose() > 2) ...
BUG: error with empty distributed roots specification (fixes#2196)
- previously used the size of distributed roots to transmit if the
case was running in distributed mode, but this behaves rather poorly
with bad input. Specifically, the following questionable setup:
distributed true;
roots ( /*none*/ );
Now transmit the ParRunControl distributed() value instead,
and also emit a gentle warning for the user:
WARNING: running distributed but did not specify roots!
- provide a plain stream() method on messageStream to reduce reliance
on casting operators and slightly opaque operator()() calls etc
- support alternative stream for messageStream serial output.
This can be used to support local redirection of output.
For example,
refPtr<OFstream> logging; // or autoPtr, unique_ptr etc
// Later...
Info.stream(logging.get())
<< "Detailed output ..." << endl;
This will use the stdout semantics in the normal case, or allow
redirection to an output file if a target output stream is defined,
but still effectively use /dev/null on non-master processes.
This is mostly the same as this ternary
(logging ? *logging : Info())
except that the ternary could be incorrect on sub-processes,
requires more typing etc.
ENH: use case-relative names of dictionary, IOstream for FatalIOError
- normally yields more easily understandable information
- UPstream::mpiGather (MPI_Gather) - used by Pstream::listGatherValues
- UPstream::mpiScatter (MPI_Scatter) - used by Pstream::listScatterValues
These are much simpler forms for gather/scatter of fixed-sized
contiguous types data types (eg, primitives, simple pairs etc).
In the gather form, creates a list of gathered values on the master
process. The subranks have a list size of zero.
Similarly, scatter will distribute a list of values to single values
on each process.
Instead of
labelList sendSizes(Pstream::nProcs());
sendSizes[Pstream::myProcNo()] = sendData.size();
Pstream::gatherList(sendSizes);
Can write
const labelList sendSizes
(
UPstream::listGatherValues<label>(sendData.size())
);
// Less code, lower overhead and list can be const.
For scattering an individual value only,
instead of
labelList someValues;
if (Pstream::master()) someValues = ...;
Pstream::gatherList(sendSizes);
const label localValue
(
someValues[Pstream::myProcNo()]
);
Can write
labelList someValues;
if (Pstream::master()) someValues = ...;
Pstream::gatherList(sendSizes);
const label localValue
(
UPstream::listScatterValues<label>(someValues)
);
Can of course also mix listGatherValues to assemble a list on master
and use Pstream::scatterList to distribute.
ENH: adjusted globalIndex gather methods
- added mpiGather() method [contiguous data only] using MPI_Gatherv
- respect localSize if gathering master data to ensure that a
request for 0 master elements is properly handled.
- the size of a List often requires adjustment prior to an operation,
but old values (if any) are not of interest and will be overwritten.
In these cases can use the _nocopy versions to avoid additional memory
overhead of the intermediate list and the copy/move overhead of
retaining the old values (that we will subsequently discard anyhow).
No equivalent for PtrList/UPtrList - this would be too fragile.
- add swap DynamicField with DynamicList
BUG: fixed Dynamic{Field,List} setCapacity corner case
- for the case when the newly requested capacity coincides with the
current addressable size, the resize of the underlying list would have
been bypassed - ie, the real capacity was not actually changed.
- remove (unused) PtrDynList setCapacity method as too fragile
- previously returned the range slice as a UList,
but this prevents convenient assignment.
Apply similar handling for Field/SubField
Allows the following
labelRange range(...);
fullList.slice(range) = identity(range.size());
and
fullList.slice(range) = UIndirectList<T>(other, addr);
ENH: create SubList from full FixedList (simplifies interface)
- allow default constructed SubList. Use shallowCopy to 'reset' later
- support simple mesh subsetting to vtu formats to enable debug output
for a subsection of an existing polyMesh
- rudimentary support for VTK from cellShapes is intended for handling
basic primitive cell shapes without a full blown polyMesh description.
For example,
// Create an empty polyMesh with points only
polyMesh debugMesh
(
io,
std::move(points),
faceList(), // faces
labelList(), // owner
labelList(), // neighbour
true // syncPar
);
// Establish the appropriate VTK sizing for the TET shapes
vtk::vtuCells vtuCells;
vtuCells.resetShapes(debugCutTets);
vtuCells.nPoints(debugMesh.nPoints());
NOTE
Since the vtk::internalMeshWriter only uses the polyMesh reference
for the points, it is also possible to create the vtuCells
description without a pointField (or from a different mesh
description) and write out the connectivity using the pointField
from a different mesh.
- allows reuse similar to refPtr for wrapping different content.
- additional control for when contents are copied back,
instead of waiting for the adaptor to go out of scope.
Eg,
if (adaptor.active())
{
adaptor.commit();
adaptor.clear();
}
- static ConstPrecisionAdaptor::get method renamed to 'select' as a
better description of its purpose and avoid confusion with
non-static 'get' method.
Was previously only used within GAMGPreconditioner, but even there
it is better just to use the ConstPrecisionAdaptor directly.
- Can specify
```
transform
{
origin (1 2 3);
rotation none;
}
```
instead of the longer form
```
transform
{
origin (1 2 3);
e1 (1 0 0);
e3 (0 0 1);
}
```
COMPAT: remove old (pre-1812) Euler and STARCD keywords
- use "angles", remove old compat name "rotation"
as clutter and possible confusion with "coordinate rotation"
STYLE: remove coordinate rotation move constructors
- an unnecessary holdover from older code.