Original commit message:
------------------------
Parallel IO: New collated file format
When an OpenFOAM simulation runs in parallel, the data for decomposed fields and
mesh(es) has historically been stored in multiple files within separate
directories for each processor. Processor directories are named 'processorN',
where N is the processor number.
This commit introduces an alternative "collated" file format where the data for
each decomposed field (and mesh) is collated into a single file, which is
written and read on the master processor. The files are stored in a single
directory named 'processors'.
The new format produces significantly fewer files - one per field, instead of N
per field. For large parallel cases, this avoids the limit on the number of
open files imposed by the operating system.
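As an illustrative sketch (assuming a two-processor case with a single field p
written at time 0; the paths are for illustration only), the two layouts
compare as follows:

// uncollated: one directory per processor, one file per field per processor
case/processor0/0/p
case/processor1/0/p

// collated: a single 'processors' directory, one file per field
case/processors/0/p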
The file writing can be threaded, allowing the simulation to continue running
while the data is being written to file. NFS (Network File System) is not
needed when using the collated format and, additionally, there is an option
to run without NFS with the original uncollated approach, known as
"masterUncollated".
The controls for the file handling are in the OptimisationSwitches of
etc/controlDict:
OptimisationSwitches
{
    ...

    //- Parallel IO file handler
    //  uncollated (default), collated or masterUncollated
    fileHandler uncollated;

    //- collated: thread buffer size for queued file writes.
    //  If set to 0 or not sufficient for the file size, threading is not used.
    //  Default: 2e9
    maxThreadFileBufferSize 2e9;

    //- masterUncollated: non-blocking buffer size.
    //  If the file exceeds this buffer size, scheduled transfer is used.
    //  Default: 2e9
    maxMasterFileBufferSize 2e9;
}
When using the collated file handling, memory is allocated for the data in the
write thread. maxThreadFileBufferSize sets the maximum size in bytes of the
memory that is allocated. If the data exceeds this size, the write does not
use threading.
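A case-level override can also be used to switch threading off entirely (a
minimal sketch; as noted above, a value of 0 disables threaded writing):

OptimisationSwitches
{
    maxThreadFileBufferSize 0;  // disable threaded collated writes
}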
When using the masterUncollated file handling, non-blocking MPI communication
requires a sufficiently large memory buffer on the master node.
maxMasterFileBufferSize sets the maximum size in bytes of the buffer. If the
data exceeds this size, the system uses scheduled communication.
The installation defaults for the fileHandler choice, maxThreadFileBufferSize
and maxMasterFileBufferSize (set in etc/controlDict) can be overridden within
the case controlDict file, like other parameters. Additionally, the fileHandler
can be set by (as illustrated below):
- the "-fileHandler" command line argument;
- a FOAM_FILEHANDLER environment variable.
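For example, to select the collated handler (an illustrative four-processor
run of simpleFoam; any solver can be substituted), each mechanism looks as
follows. In the case system/controlDict:

OptimisationSwitches
{
    fileHandler collated;
}

on the command line:

mpirun -np 4 simpleFoam -parallel -fileHandler collated

or via the environment:

export FOAM_FILEHANDLER=collated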
A foamFormatConvert utility allows users to convert files between the collated
and uncollated formats, e.g.
mpirun -np 2 foamFormatConvert -parallel -fileHandler uncollated
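Conversion in the opposite direction works the same way, selecting the
collated handler instead, e.g.
mpirun -np 2 foamFormatConvert -parallel -fileHandler collated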
An example case demonstrating the file handling methods is provided in:
$FOAM_TUTORIALS/IO/fileHandling
The work was undertaken by Mattijs Janssens, in collaboration with Henry Weller.
- Allows configuration without an environment variable. For compatibility the
  FOAM_SIGFPE and FOAM_SETNAN env-variables are still respected.
- The env-variables are now treated as true/false switch values. Previously
  there was only a check for whether the variable existed, which can be fairly
  fragile for a user's environment.
- Cleanup/centralize handling of -decomposeParDict by relocating
  common code into argList. This ensures that all processes receive
  identical information about the -decomposeParDict option.
- Only use the alternative decomposeParDict for the simpleFoam/motorBike
  tutorial, so that this will be included in the test loop for snappyHexMesh.
- Added Mattijs' fix for surfaceRedistributePar.
splitMeshRegions: handle flipping of faces for surface fields
subsetMesh: subset dimensionedFields
decomposePar: use run-time selection of decomposition constraints, used to
keep cells on particular processors. See the decomposeParDict in
$FOAM_UTILITIES/parallel/decomposePar and the sketch after the list below:
- preserveBaffles: keep baffle faces on same processor
- preserveFaceZones: keep faceZones owner and neighbour on same processor
- preservePatches: keep owner and neighbour on same processor. Note: not
suitable for cyclicAMI since these are not coupled on the patch level
- singleProcessorFaceSets: keep complete faceSet on a single processor
- refinementHistory: keep cells originating from a single cell on the
same processor.
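A minimal sketch of the constraints dictionary in decomposeParDict (the entry
names, the patch name 'cyclic' and the faceSet name 'f0' are illustrative
only; see the annotated decomposeParDict referenced above for the
authoritative syntax):

constraints
{
    patches
    {
        //- Keep owner and neighbour of patch faces on the same processor
        type    preservePatches;
        patches (cyclic);
    }

    processorFaceSets
    {
        //- Keep the complete faceSet f0 on a single processor (-1 = any)
        type    singleProcessorFaceSets;
        singleProcessorFaceSets ((f0 -1));
    }

    refinementHistory
    {
        //- Keep cells originating from a single cell on the same processor
        type    refinementHistory;
    }
}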
decomposePar: clean up decomposition of refinement data from snappyHexMesh
reconstructPar: reconstruct refinement data (refineHexMesh, snappyHexMesh)
reconstructParMesh: reconstruct refinement data (refineHexMesh, snappyHexMesh)
redistributePar:
- corrected mapping of surfaceFields
- adding processor patches in order consistent with decomposePar
argList: check that slaves are running the same version as the master
fvMeshSubset: move to dynamicMesh library
fvMeshDistribute:
- support for mapping dimensionedFields
- corrected mapping of surfaceFields
parallel routines: allow parallel running on single processor
Field: support for
- distributed mapping
- mapping with flipping
mapDistribute: support for flipping
AMIInterpolation: avoid constructing localPoints
extrapolatedCalculatedFvPatchField: to be used instead of
zeroGradientFvPatchField for temporary fields for which zero-gradient
extrapolation is used to evaluate the boundary field. This avoids fields
that are derived from a temporary field by field algebra inheriting the
zeroGradient boundary condition through the reuse of the temporary field's
storage.
zeroGradientFvPatchField should not be used as the default patch field
for any temporary fields and should be avoided for non-temporary fields
except where it is clearly appropriate;
extrapolatedCalculatedFvPatchField and calculatedFvPatchField are
generally more suitable defaults depending on the manner in which the
boundary values are specified or evaluated.
The entire OpenFOAM-dev code-base has been updated following the above
recommendations.
Henry G. Weller
CFD Direct
Moved file path handling to regIOobject and made it type-specific, so that
every object can now have its own rules. Examples:
- faceZones are now processor local (and don't search up anymore)
- timeStampMaster is now no longer hardcoded inside IOdictionary
(e.g. uniformDimensionedFields support it as well)
- the distributedTriSurfaceMesh is properly processor-local; no need
for fileModificationChecking manipulation.
- redistributePar now provides almost the complete functionality of
  decomposePar + reconstructPar
- low-level distributed Field mapping
- support for mapping surfaceFields (including flipping faces)
- support for decomposing/reconstructing refinement data