- allows for simpler unpacking of a full list, or list range into any
sufficiently large integral type.
For example,
processorPolyPatch pp = ...;
UOPstream toNbr(pp.neighbProcNo(), pBufs);
toNbr << faceValues.unpack<char>(pp.range());
- The bitSet class replaces the old PackedBoolList class.
The redesign provides better block-wise access and reduced method
calls. This helps both in cases where the bitSet may be relatively
sparse, and in cases where advantage of contiguous operations can be
made. This makes it easier to work with a bitSet as top-level object.
In addition to the previously available count() method to determine
if a bitSet is being used, now have simpler queries:
- all() - true if all bits in the addressable range are empty
- any() - true if any bits are set at all.
- none() - true if no bits are set.
These are faster than count() and allow early termination.
The new test() method tests the value of a single bit position and
returns a bool without any ambiguity caused by the return type
(like the get() method), nor the const/non-const access (like
operator[] has). The name corresponds to what std::bitset uses.
The new find_first(), find_last(), find_next() methods provide a faster
means of searching for bits that are set.
This can be especially useful when using a bitSet to control an
conditional:
OLD (with macro):
forAll(selected, celli)
{
if (selected[celli])
{
sumVol += mesh_.cellVolumes()[celli];
}
}
NEW (with const_iterator):
for (const label celli : selected)
{
sumVol += mesh_.cellVolumes()[celli];
}
or manually
for
(
label celli = selected.find_first();
celli != -1;
celli = selected.find_next()
)
{
sumVol += mesh_.cellVolumes()[celli];
}
- When marking up contiguous parts of a bitset, an interval can be
represented more efficiently as a labelRange of start/size.
For example,
OLD:
if (isA<processorPolyPatch>(pp))
{
forAll(pp, i)
{
ignoreFaces.set(i);
}
}
NEW:
if (isA<processorPolyPatch>(pp))
{
ignoreFaces.set(pp.range());
}
- use succincter method names that more closely resemble dictionary
and HashTable method names. This improves method name consistency
between classes and also requires less typing effort:
args.found(optName) vs. args.optionFound(optName)
args.readIfPresent(..) vs. args.optionReadIfPresent(..)
...
args.opt<scalar>(optName) vs. args.optionRead<scalar>(optName)
args.read<scalar>(index) vs. args.argRead<scalar>(index)
- the older method names forms have been retained for code compatibility,
but are now deprecated
The compact ASCII format is a block of index/value tuples for the
non-zero entries:
{ (index1 value1) (index2 value2) (index3 value3) }
For PackedList<1>, and thus PackedBoolList, the compact ASCII format is
a block of indices for the non-zero entries:
{ index1 index2 index3 }
Thus either of the following could be used - for PackedList<2>:
- a list of all values:
16(0 3 0 2 0 0 3 1 0 0 0 0 0 0 0 1)
- a block of index/value tuples:
{(1 3) (3 2) (7 3) (8 1) (15 1)}
For PackedList<1> and PackedBoolList, either of the following could be
used:
- a list of all values, using any valid bool representation:
16(0 1 0 true 0 0 t 1 0 n n 0 0 0 0 yes)
- a block of the indices for non-zero entries:
{1 3 7 8 15}
- now that I re-examined the code, the note in commit 51fd6327a6
can be mostly ignored
PackedList isMaster(nPoints, 1u);
is not really inefficient at all, since the '1u' is packed into
32/64-bits before the subsequent assignment and doesn't involve
shifts/masking for each index
The same misinformation applies to the PackedList(size, 0u) form.
It isn't much slower at all.
Nonetheless, add bool specialization so that it is a simple assign.
- use shift-right instead of shift-left formulation to avoid wrong behaviour
with non-optimized compilation when the packed items fit exactly in the
available number of bits.
- compare iteratorBase == iteratorBase by value, not position
thus this works
list[a] == list[b] ...
- compare iterator == iteratorBase and const_iterator == iteratorBase
by position, not value. The inheritance rules means that this works:
iter == list.end() ...
this will compare positions:
iter == list[5];
Of course, this will still compare values:
*iter == list[5];
- OSspecific: chmod() -> chMod(), even although it's not used anywhere
- ListOps get subset() and inplaceSubset() templated on BoolListType
- added UList<bool>::operator[](..) const specialization.
Returns false (actually pTraits<bool>::zero) for out-of-range elements.
This lets us use List<bool> with lazy evaluation and no noticeable
change in performance.
- use rcIndex() and fcIndex() wherever possible.
Could check if branching or modulus is faster for fcIndex().
- UList and FixedList get 'const T* cdata() const' and 'T* data()' members.
Similar to the STL front() and std::string::data() methods, they return a
pointer to the first element without needing to write '&myList[0]', recast
begin() or violate const-ness.
- moving back to original flat addressing in iterators means there is no
performance issue with using lazy evaluation
- set() method now has ~0 for a default value.
We can thus simply write 'set(i) to trun on all of the bits.
This means we can use it just like labelHashSet::set(i)
- added flip() method for inverting bits. I don't know where we might need
it, but the STL has it so we might as well too.
- dropped auto-vivification for now (performance issue), but reworked to
allow easy reinstatement
- derived both iterator and const_iterator from iteratorBase and use
iteratorBase as our proxy for non-const access to the list elements.
This allows properly chaining assignments:
list[1] = list[2];
list[1] = list[2] = 10;
- assigning iterators from iteratorBase or other iterators works:
iterator iter = list[20];
- made template parameter nBits=1 the default
- the bit counting is relatively fast:
under 0.2 seconds for 1M bits counted 1000 times
- trim()'ing the final zero elements tested for a few cases,
but might need more attention
- added Mattijs' speed tests
- optimized resize() and assignment operators to avoid set() method
- add const_iterator and re-did the proxy handling.
Reading/writing by looping across iterators is still somewhat slow, but
might be acceptable.
- eliminated previous PackedBitRef class, the iterator does all of that and
can also be used to (forward) traverse the list
- no const_iterator yet
- Note that PackedList is also a bit like DynamicList in terms of storage
management and the append() method. Since the underlying storage in
integer, any auto-vivified elements will also flood-fill the gaps with
zero.