- the maxCommsSize variable is used to 'chunk' large data transfers
(eg, with PstreamBuffers) into a multi-pass send/recv sequence.
The send/recv windows for chunk-wise transfers:
iter data window
---- -----------
0 [0, 1*chunk]
1 [1*chunk, 2*chunk]
2 [2*chunk, 3*chunk]
...
Since we mostly send/recv in bytes, the current internal limit
for MPI counts (INT_MAX) can be hit rather quickly.
The chunking limit should thus also be INT_MAX, but since it is
rather tedious to specify such large numbers, can instead use
maxCommsSize = -1
to specify (INT_MAX-1) as the limit.
The default value of maxCommsSize = 0 (ie, no chunking).
Note
~~~~
In previous versions, the number of chunks was determined by the
sender sizes. This required an additional MPI_Allreduce to establish
an overall consistent number of chunks to walk. This additional
overhead each time meant that maxCommsSize was rarely actually
enabled.
We can, however, instead rely on the send/recv buffers having been
consistently sized and simply walk through the local send/recvs until
no further chunks need to be exchanged. As an additional enhancement,
the message tags are connected to chunking iteration, which allows
the setup of all send/recvs without an intermediate Allwait.
ENH: extend UPstream::probeMessage to use int64 instead of int for sizes
- "buffered" corresponds to MPI_Bsend (buffered send),
whereas the old name "blocking" is misleading since the
regular MPI_Send also blocks until completion
(ie, buffer can be reused).
ENH: IPstream::read() returns std::streamsize instead of label (#3152)
- previously returned a 'label' but std::streamsize is consistent with
the input parameter and will help with later adjustments.
- use <label> instead of <int> for internal accounting of the message
size, for consistency with the underyling List<char> buffers used.
- improve handling for corner case of IPstream receive with
non-blocking, although this combination is not used anywhere
- was previously limited to 'char' whereas gatherv/scatterv
already supported various integer and float types
STYLE: rebundle allToAll declarations with macros
ENH: provide a version of allToAllConsensus returning the Map
- simplifies use and avoids ambiguities in the send/recv parameters
- the Map version will now also transmit zero value data if they exist
in the Map. Unlike the List version, zero values are not necessary to
signal connectivity with a Map.
COMP: forwarding template parameters for NBX routines
ENH: consolidate PstreamBuffers size exchange options
- had a variety of nearly identical backends for all-to-all,
gather/scatter. Now combined internally with a dispatch enumeration
which provides better control over which size exchange algorithm
is used.
DEFEATURE: remove experimental full-NBX PstreamBuffers variant
- no advantages seen compared to the hybrid NBX/PEX approach.
Removal reduces some code cruft.
DEFEATURE: remove experimental "double non-blocking" NBX version
- the idea was to avoid blocking receives for very large data transfers,
but that is usually better accomplished with a hybrid NBX/PEX approach
like PstreamBuffers allows
- previously had an additional stack for freedRequests_,
which were used to 'remember' locations into the list of
outstandingRequests_ that were handled by 'waitRequest()'.
This was principally done for sanity checks on shutdown,
but we now just test for any outstanding requests that
are *not* MPI_REQUEST_NULL instead (much simpler).
The framework with freedRequests_ also had a provision to 'recycle'
them by popping from that stack, but this is rather fragile since it
would only triggered by some collectives
(MPI_Iallreduce, MPI_Ialltoall, MPI_Igather, MPI_Iscatter)
with no guarantee that these will all be properly removed again.
There was also no pruning of extraneous indices.
ENH: consolidate internal reset/push of requests
- replace duplicate code with inline functions
reset_request(), push_request()
ENH: null out trailing requests
- extra safety (paranoia) for the UPstream::Request versions
of finishedRequests(), waitAnyRequest()
CONFIG: document nPollProcInterfaces in etc/controlDict
- still experimental, but at least make the keyword known
- simplifies code, consistent with other matrix transfer functions.
Use a setter method.
STYLE: AMIInterpolation::upToDate(bool) setter method
ENH: add guards to avoid float-compressed transfer of integral types
STYLE: drop unused debug member from abstract interface classes
- since ensight format is always float and also always written
component-wise, perform the double -> float narrowing when
extracting the components. This reduces the amount of data
transferred between processors.
ENH: avoid vtk/ensight parallel communication of empty messages
- since ensight writes by element type (eg, tet, hex, polyhedral) the
individual written field sections will tend to be relatively sparse.
Skip zero-size messages, which should help reduce some of the
synchronization bottlenecks.
ENH: use 'data chunking' when writing ensight files in parallel
- since ensight fields are written on a per-element basis, the
corresponding segment can become rather sparsely distributed. With
'data chunking', we attempt to get as many send/recv messages in
before flushing the buffer for writing. This should make the
sequential send/recv less affected by the IO time.
ENH: allow use of an external buffer when writing ensight components
STYLE: remove last vestiges of autoPtr<ensightFile> for output routines
- the very old 'writer' class was fully stateless and always templated
on an particular output type.
This is now replaced with a 'coordSetWriter' with similar concepts
as previously introduced for surface writers (#1206).
- writers change from being a generic state-less set of routines to
more properly conforming to the normal notion of a writer.
- Parallel data is done *outside* of the writers, since they are used
in a wide variety of contexts and the caller is currently still in
a better position for deciding how to combine parallel data.
ENH: update sampleSets to sample on per-field basis (#2347)
- sample/write a field in a single step.
- support for 'sampleOnExecute' to obtain values at execution
intervals without writing.
- support 'sets' input as a dictionary entry (as well as a list),
which is similar to the changes for sampled-surface and permits use
of changeDictionary to modify content.
- globalIndex for gather to reduce parallel communication, less code
- qualify the sampleSet results (properties) with the name of the set.
The sample results were previously without a qualifier, which meant
that only the last property value was actually saved (previous ones
overwritten).
For example,
```
sample1
{
scalar
{
average(line,T) 349.96521;
min(line,T) 349.9544281;
max(line,T) 350;
average(cells,T) 349.9854619;
min(cells,T) 349.6589286;
max(cells,T) 350.4967271;
average(line,epsilon) 0.04947733869;
min(line,epsilon) 0.04449639927;
max(line,epsilon) 0.06452856475;
}
label
{
size(line,T) 79;
size(cells,T) 1720;
size(line,epsilon) 79;
}
}
```
ENH: update particleTracks application
- use globalIndex to manage original parcel addressing and
for gathering. Simplify code by introducing a helper class,
storing intermediate fields in hash tables instead of
separate lists.
ADDITIONAL NOTES:
- the regionSizeDistribution largely retains separate writers since
the utility of placing sum/dev/count for all fields into a single file
is questionable.
- the streamline writing remains a "soft" upgrade, which means that
scalar and vector fields are still collected a priori and not
on-the-fly. This is due to how the streamline infrastructure is
currently handled (should be upgraded in the future).
- if the system/decomposeParDict is missing, skip check for matching
number of processor dirs. This can make job dispatch easier.
Does not apply if -decomposeParDict was explicitly specified.
STYLE: adjust naming of host/slaves in argList
- previously always called dlclose on opened libraries when destroying
the dlLibraryTable. However, by force closing the libraries the
situation can arise that the library is missing its own code that it
needs on unload (#1524). This is also sometimes evident when closing
VTK libraries for runTimePostProcessing (#354, #1585).
- The new default is to not forcibly dlclose any libraries, unless
the dlcloseOnTerminate OptimisationSwitch specifies otherwise.
- The dlLibraryTable::close() method can be used to explicitly close
all libraries and clear the list.
- The dlLibraryTable::clear() method now only clears the entries,
without a dlclose.
- timeVaryingUniformFixedValue -> uniformFixedValue
- allows a variety of functions (eg, coded, expressions, tables, ...)
- more similarity to finiteVolume patch type
STYLE: remove unused timeVarying... from etc/controlDict
- remove (unused) Istream constructors, prune some unused methods,
rationalize write() vs writeDict().
Deprecate inconsistent construction order.
- handle empty names for ".ftr" surface patches (for plain triSurface
format) with double-quoted strings for more reliable streaming.
Written on a single line.
This is _backward_ compatible, but if users have been parsing these
files manually, they will need to adjust their code.
Previously:
```
(
frt-fairing:001%1
empty
windshield:002%2
empty
...
)
```
Updated (with example handling of empty name):
```
(
frt-fairing:001%1 empty
windshield:002%2 ""
...
)
```
- now use debug 2 for scanner and debug 4 for parser.
Provided better feedback about what is being parsed (debug mode)
- relocate debug application to applications/tools/foamExprParserInfo
- this cannot be left as a configurable value (on windows), since it
needs to be enabled even prior to reading the etc/controlDict file,
in case the OpenFOAM installation path itself contains spaces.
- can be used to check the validity of input values.
Example:
dict.getCheck<label>("nIters", greaterOp1<label>(0));
dict.getCheck<scalar>("relax", scalarMinMax::zero_one());
- use 'get' prefix for more regular dictionary methods.
Eg, getOrDefault() as alternative to lookupOrDefault()
- additional ops for convenient construction of predicates
ENH: make dictionary writeOptionalEntries integer
- allow triggering of Fatal if default values are used
ENH: additional scalarRange static methods: ge0, gt0, zero_one
- use GREAT instead of VGREAT for internal placeholders
- additional MinMax static methods: gt, le
- having whitespace in fileName can be somewhat fragile since it means
that the fileName components do not necessarily correspond to a
'Foam::word'. But in many cases it will work provided that spaces
are not present in the final portion of the simulation directory
itself.
InfoSwitches
{
// Allow space character in fileName (use with caution)
allowSpaceInFileName 0;
}
- now use doClean=true as default for fileName::validate(). Was false.
Unlike fileName::clean() this requires no internal string rewrite
since the characters are being copied. Also handle any path
separator transformations (ie, backslash => forward slash) at the
same time. This makes it resemble the std::filesystem a bit more.
- Extended runTimePostProcessing to include access to "live"
simulation objects such a geometry patches and sampled surfaces
stored on the "functionObjectObjects" registry.
- Add 'live' runTimePostProcessing of cloud data.
Extracts position and fields from the cloud via its objectRegistry writer
- For the "live" simulation objects, there are two new volume filters
that work directly with the OpenFOAM volume fields:
* iso-surface
* cutting planes
Both use the VTK algorithms directly and support multiple values.
Eg, can make multiple iso-levels or multiple planes parallel to each
other.
- When VTK has been compiled with MPI-support, parallel rendering will
be used.
- Additional title text properties (shadow, italic etc)
- Simplified handling of scalar-bar and visibility switches
- Support multiple text positions. Eg, for adding watermark text.
- changed the sectorCoeffs keyword to 'point' from 'axisPt'
for more similarity with other dictionaries.
Continue to accept 'axisPt' for compatibility.
- this helps for many cases outlined in issue #1007, but can also be
useful when simply using symlinks for shorter or reorganized
directory structures.
- with the 'cwd' optimization switch it is possible to select the
preferred behaviour for the cwd() function.
A value of 0 causes cwd() to return the physical directory,
which is what getcwd() and `pwd -P` return.
Until now, this was always the standard behaviour.
With a value of 1, cwd() instead returns the logical directory,
which what $PWD contains and `pwd -L` returns.
If any of the sanity checks fail (eg, PWD points to something other
than ".", etc), a warning is emitted and the physical cwd() is
returned instead.
Apart from the optical difference in the output, this additional
control helps workaround file systems with whitespace or other
characters in the directory that normally cause OpenFOAM to balk.
Using a cleaner symlink elsewhere should skirt this issue.
Eg,
cd $HOME
ln -s "/mounted volume/user/workdir" workdir
cd workdir
# start working with OpenFOAM
Previously the coordinate system functionality was split between
coordinateSystem and coordinateRotation. The coordinateRotation stored
the rotation tensor and handled all tensor transformations.
The functionality has now been revised and consolidated into the
coordinateSystem classes. The sole purpose of coordinateRotation
is now just to provide a selectable mechanism of how to define the
rotation tensor (eg, axis-angle, euler angles, local axes) for user
input, but after providing the appropriate rotation tensor it has
no further influence on the transformations.
--
The coordinateSystem class now contains an origin and a base rotation
tensor directly and various transformation methods.
- The origin represents the "shift" for a local coordinate system.
- The base rotation tensor represents the "tilt" or orientation
of the local coordinate system in general (eg, for mapping
positions), but may require position-dependent tensors when
transforming vectors and tensors.
For some coordinate systems (currently the cylindrical coordinate system),
the rotation tensor required for rotating a vector or tensor is
position-dependent.
The new coordinateSystem and its derivates (cartesian, cylindrical,
indirect) now provide a uniform() method to define if the rotation
tensor is position dependent/independent.
The coordinateSystem transform and invTransform methods are now
available in two-parameter forms for obtaining position-dependent
rotation tensors. Eg,
... = cs.transform(globalPt, someVector);
In some cases it can be useful to use query uniform() to avoid
storage of redundant values.
if (cs.uniform())
{
vector xx = cs.transform(someVector);
}
else
{
List<vector> xx = cs.transform(manyPoints, someVector);
}
Support transform/invTransform for common data types:
(scalar, vector, sphericalTensor, symmTensor, tensor).
====================
Breaking Changes
====================
- These changes to coordinate systems and rotations may represent
a breaking change for existing user coding.
- Relocating the rotation tensor into coordinateSystem itself means
that the coordinate system 'R()' method now returns the rotation
directly instead of the coordinateRotation. The method name 'R()'
was chosen for consistency with other low-level entities (eg,
quaternion).
The following changes will be needed in coding:
Old: tensor rot = cs.R().R();
New: tensor rot = cs.R();
Old: cs.R().transform(...);
New: cs.transform(...);
Accessing the runTime selectable coordinateRotation
has moved to the rotation() method:
Old: Info<< "Rotation input: " << cs.R() << nl;
New: Info<< "Rotation input: " << cs.rotation() << nl;
- Naming consistency changes may also cause code to break.
Old: transformVector()
New: transformPrincipal()
The old method name transformTensor() now simply becomes transform().
====================
New methods
====================
For operations requiring caching of the coordinate rotations, the
'R()' method can be used with multiple input points:
tensorField rots(cs.R(somePoints));
and later
Foam::transformList(rots, someVectors);
The rotation() method can also be used to change the rotation tensor
via a new coordinateRotation definition (issue #879).
The new methods transformPoint/invTransformPoint provide
transformations with an origin offset using Cartesian for both local
and global points. These can be used to determine the local position
based on the origin/rotation without interpreting it as a r-theta-z
value, for example.
================
Input format
================
- Streamline dictionary input requirements
* The default type is cartesian.
* The default rotation type is the commonly used axes rotation
specification (with e1/e2/3), which is assumed if the 'rotation'
sub-dictionary does not exist.
Example,
Compact specification:
coordinateSystem
{
origin (0 0 0);
e2 (0 1 0);
e3 (0.5 0 0.866025);
}
Full specification (also accepts the longer 'coordinateRotation'
sub-dictionary name):
coordinateSystem
{
type cartesian;
origin (0 0 0);
rotation
{
type axes;
e2 (0 1 0);
e3 (0.5 0 0.866025);
}
}
This simplifies the input for many cases.
- Additional rotation specification 'none' (an identity rotation):
coordinateSystem
{
origin (0 0 0);
rotation { type none; }
}
- Additional rotation specification 'axisAngle', which is similar
to the -rotate-angle option for transforming points (issue #660).
For some cases this can be more intuitive.
For example,
rotation
{
type axisAngle;
axis (0 1 0);
angle 30;
}
vs.
rotation
{
type axes;
e2 (0 1 0);
e3 (0.5 0 0.866025);
}
- shorter names (or older longer names) for the coordinate rotation
specification.
euler EulerRotation
starcd STARCDRotation
axes axesRotation
================
Coding Style
================
- use Foam::coordSystem namespace for categories of coordinate systems
(cartesian, cylindrical, indirect). This reduces potential name
clashes and makes a clearer declaration. Eg,
coordSystem::cartesian csys_;
The older names (eg, cartesianCS, etc) remain available via typedefs.
- added coordinateRotations namespace for better organization and
reduce potential name clashes.
- corrected the mass based correction and updated the misleading function
arguments
- moved the option to the optimisation switches, e.g.:
OptimisationSwitches
{
experimentalDdtCorr 1;
}
- default remains off/no (0)