GF2++
Loading...
Searching...
No Matches
The BitVec Class

Introduction

A gf2::BitVec is a dynamically sized vector of bit elements stored compactly in an array of unsigned integer words.

The class satisfies the gf2::BitStore concept, which provides a rich API for manipulating the bits in the vector. The free functions defined for that concept are also pulled into the class as member functions. For example, if v is a gf2::BitVec, you can call v.count_ones() to count the number of set bits in the vector instead of calling the free function gf2::count_ones(v), though both forms are valid.

In addition to the many methods that mirror a corresponding utility function defined in BitStore.h, the gf2::BitVec class provides methods to construct bit-vectors in various ways, methods to resize a vector, and methods to append or remove elements from the end of a vector.

In mathematical terms, a bit-vector is a vector over GF2, the simplest Galois-Field with just two elements, usually denoted 0 & 1, as the booleans true & false, or as the bits set & unset. Arithmetic over GF(2) is mod 2, so addition/subtraction becomes the XOR operation while multiplication/division becomes AND.

Note
Operations on and between bit-vectors and other objects in the gf2 library are implemented using bitwise operations on whole underlying words at a time. These operations are highly optimised in modern CPUs, allowing for fast computation even on large bit-vectors. It also means we never have to worry about overflows or carries as we would with normal integer arithmetic.

The gf2::BitVec class is a hybrid between a std::vector and a std::bitset, along with extra mathematical features to facilitate numerical work, and in particular, linear algebra.

One can dynamically resize a BitVec as needed. Contrast this to a std::bitset, which has a fixed size determined at compile time. Boost provides the boost::dynamic_bitset class, which allows runtime resizing, as its name suggests. However, neither class supports algebraic operations.

It is worth noting that a BitVec is printed in vector order. For example, a bit-vector of size four will print as \(v_0 v_1 v_2 v_3\) with the elements in increasing index order, so the least significant vector element, \(v_0\), comes first on the left. Contrast that to a std::bitset, which always prints in bit-order. The equivalent std::bitset with four elements prints as \(b_3 b_2 b_1 b_0\) with the least significant bit \(b_0\) printed last on the right.

Of course, for many applications, printing in bit-order makes perfect sense. A bit vector of four elements initialised to 0x1 will print 1000. A std::bitset prints the same value as 0001, which will be more natural in some settings.

However, our main aim is numerical work, where vector order is more natural. In particular, bit-order is unnatural for matrices over GF(2). It is too confusing to print a matrix in any order other than the one where the (0,0) element is at the top left, and proceed from there.

Declaration

The declaration of the bit-vector class looks like:

template <Unsigned Word = usize>
class BitVec {
public:
using word_type = Word;
// ...
};

The Word template parameter specifies the underlying gf2::Unsigned integer type used to hold the vector's bits, and the default is usually the most efficient type for the target platform. On most modern platforms, that usize will be a 64-bit unsigned integer. The choice of underlying word type is exposed via the public word_type type alias as required by the gf2::BitStore concept.

If your application calls for a vast number of bit-vectors with only a few bits each, you might consider using std::uint8_t as the Word type to save memory.

Note
If you peruse the header files, you may notice that many of the doctests in the library use 8-bit underlying words. The reason is we want to exercise the various functions across word boundaries for modest, easily readable bit-stores.

Methods Overview

The gf2::BitVec class provides a rich set of methods for constructing, querying, and manipulating, bit-vectors. Here is an overview of the main methods available in the class:

Category Description
Concept Methods Methods needed to satisfy the gf2::BitStore concept.
Constructors Methods to create bit-vectors of various sizes and initial values.
Factory Constructors Class methods to create bit-vectors with specific properties or fills.
Construction from Strings Class methods to create bit-vectors from string representations.
Resizing Methods to query and manipulate the size and capacity of a bit-vector.
Appending Elements Methods to append bits from various sources to the end of a bit-vector.
Removing Elements Methods to remove bits from the end of a bit-vector.
Bit Access Methods to access individual bit elements in a bit-vector.
Queries Methods to query the overall state of a bit-vector.
Mutators Methods to mutate the overall state of a bit-vector.
Fills Methods to fill a bit-vector from various sources.
Spans Methods to create non-owning views over a part of a bit-vector — bit-spans.
Sub-vectors Methods to pull out a clone of piece of a bit-vector as new bit-vectors
Riffling Methods to create vectors that copy a bit-vector with interleaved zeros.
Set/Unset Indices Methods to find the indices of set & unset bits in a bit-vector.
Iterators Methods to create various iterators over a bit-vector.
Stringification Methods to create string representations of a bit-vector.
Equality Operator Operator to compare bit-stores, including bit-vectors for content equality.
Bit Shifts Operators to shift the bits in bit-vectors left or right.
Bit-wise Operators Operators to combine bit-stores using logical operations.
Arithmetic Operators Operators to add or subtract bit-stores.
Other Functions Dot products, convolutions, concatenation etc. for bit-vectors.

Concept Methods

Bit-vectors satisfy the gf2::BitStore concept and forwards many method calls to free functions defined for that concept. The concept requires us to provide the following methods:

Method Name Description
gf2::BitVec::size Returns the number of bit elements in the bit-vector.
gf2::BitVec::words Returns the number of Words needed to vector those elements.
gf2::BitVec::word Returns the word for the passed index.
gf2::BitVec::set_word Sets the word at the passed index to the passed value.
gf2::BitVec::store const Returns a const pointer to the first Word holding bits in the vector.
gf2::BitVec::store Returns a non-const pointer to the first Word holding bits in the vector.
gf2::BitVec::offset For bit-vectors, this always returns 0.

These methods are trivial to implement for bit vectors.

The one place where care is needed is in the gf2::BitVec::set_word method, which must ensure that any bits beyond the size of the bit-vector remain set to zero.

Warning
While the gf2::BitVec::store method provides write access to the underlying words, you must be careful when modifying them directly. You must ensure that any bits beyond the bit-vector's size remain zero. The gf2::BitVec::set_word method takes care of this for you, so prefer using that method when possible.

Constructors

The default constructor for a gf2::BitVec creates an empty bit-vector with zero size. You can also create a bit-vector of a given size using the following constructors:

Method Name Description
gf2::BitVec(usize) Constructs a bit-vector of the given length where all the elements are zero.
gf2::BitVec(usize, Word) Constructs a bit-vector by repeatedly copying the bits from the Word

Factory Constructors

There are also many static factory construction functions.

Method Name Description
gf2::BitVec::with_capacity Returns a zero-sized bit-vector that can add elements without any extra allocations.
gf2::BitVec::zeros Returns a bit-vector where all the elements are 0.
gf2::BitVec::ones Returns a bit-vector where all the elements are 1.
gf2::BitVec::constant Returns a bit-vector where all the elements are whatever is passed as a value.
gf2::BitVec::unit Returns a bit-vector where all the elements are zero except for a single 1.
gf2::BitVec::alternating Returns a bit-vector where all the elements follow the pattern 101010...
gf2::BitVec::from Returns a BitVec filled with bits from various sources.
gf2::BitVec::random Returns a BitVec with a random fill seeded from entropy.
gf2::BitVec::seeded_random Returns a BitVec with a reproducible random fill.
gf2::BitVec::biased_random Returns a random BitVec where you set the probability of bits being 1.

The gf2::BitVec::from factory method is overloaded to allow construction from:

  • Any other gf2::BitStore object. The source object need not share the same underlying storage type.
    This allows conversions from bit-stores based on a different word type.
  • Any unsigned integer type, where we copy the bits corresponding to the value.
    The source type need not be the same as the underlying Word type used by the bit-vector. The size of the resulting vector is determined by the number of bits in the source type.
  • A std::bitset where we copy the bits over.
  • A callable object (a function of some sort) that generates bits on demand when passed an index.

Construction from Strings

We can construct a gf2::BitVec from strings — these methods can fail, so they return a std::optional<gf2::BitVec> and std::nullopt if the string cannot be parsed.

Method Name Description
gf2::BitVec::from_string Tries to construct a bit-vector from an arbitrary string.
gf2::BitVec::from_binary_string Tries to construct a bit-vector from a binary string.
gf2::BitVec::from_hex_string Tries to construct a bit-vector from a hexadecimal string.

Space, comma, single quote, and underscore characters are removed from the string.

If the string has an optional "0b" prefix, it is assumed to be binary. If it has an optional "0x" prefix, it is assumed to be hex. If there is no prefix and the string consists entirely of 0s and 1s, we assume it is binary; otherwise, we think it is hex.

Warning
This means the string "0x11 is interpreted as the bit-vector of size 8 "11110001", whereas the same string without a prefix, "11" is interpreted as the bit-vector of size 2 "11". To avoid any ambiguity, it is best to use a prefix.

See the string-encodings section in the BitStore documentation for more details on the accepted string formats.

Resizing

We have methods to query and manipulate the size and capacity of a bit-vector:

Method Name Description
gf2::BitVec::size Returns the number of bit elements in the bit-vector.
gf2::BitVec::capacity Returns the total number of bits the vector can hold without allocating more memory.
gf2::BitVec::shrink_to_fit Tries to shrink the vector's capacity as much as possible.
gf2::BitVec::clear Sets the size() to zero. Leaves the capacity unaltered.
gf2::BitVec::resize Returns a const pointer to the beginning of the Word vector.
gf2::BitVec::clean Sets any unused bits in the last occupied word to 0.

The gf2::BitVec::clean method is primarily used internally in the library.

Appending Elements

We have methods to append elements from various sources to the end of a bit-vector:

Method Name Description
gf2::BitVec::push Pushes a single bit (0 or 1) onto the end of the bit-vector.
gf2::BitVec::append Appends bits from various sources to the end of the bit-vector.
gf2::BitVec::append_digit Appends a "character's" worth of bits to the end of the bit-vector.
gf2::BitVec::append_hex_digit Appends four bits from a "hex-character" to the end of the bit-vector.

The gf2::BitVec::append method is overloaded to allow appending bits from:

  • A std::bitset where we append the bits to the end of the bit-vector.
  • Any unsigned integer type, where we append the bits corresponding to the value.
    The type need not be the same as the underlying Word type used by the bit-vector.
  • Any other gf2::BitStore object, which need not share the same underlying Word storage type.

The gf2::BitVec::append_digit method appends bits from a character representing a digit in one of the bases 2, 4, 8, or 16. It does nothing if it fails to parse the character.

Removing Elements

We have methods to remove elements from the end of a bit-vector:

Method Name Description
gf2::BitVec::pop Removes the last bit from the bit-vector and returns it.
gf2::BitVec::split_off_unsigned Removes a single arbitrary-sized unsigned integer off the end of the bit-vector and returns it.
gf2::BitVec::split_off Splits a bit-vector into two at a given index

The first two methods return the removed elements as a std::optional, and as a std::nullopt if the vector is empty.

The two gf2::BitVec::split_off methods complement the gf2::BitVec::split_at methods. These versions change the size of the bit-vector in place.

Bit Access

The following methods provide access to individual bit elements in the bit-vector.

Function Description
gf2::BitVec::get Returns the value of a single bit element as a read-only boolean.
gf2::BitVec::operator[]() Returns a bool in the const case and a BitRef with read-write access to a bit element in the non-const version.
gf2::BitVec::front Returns the value of the first element in the vector.
gf2::BitVec::back Returns the value of the last element in the vector.
gf2::BitVec::set Sets a bit to the given boolean value which defaults to true.
gf2::BitVec::flip Flips the value of the bit element at a given index.
gf2::BitVec::swap Swaps the values of bit elements at locations i and j.
Note
You can set the DEBUG flag at compile time to enable bounds checks on the index arguments.

The non-const version of gf2::BitVec::operator[]() returns a gf2::BitRef, which is "reference" to an individual bit in the vector. It is automatically converted to a boolean on reads, but it also allows writes, which means you can write natural-looking single-bit assignments:

v[12] = true;

This is equivalent to calling v.set(12, true);.

Queries

The following methods let you query the overall state of a bit-vector.

Function Description
gf2::BitVec::is_empty Returns true if the vector is empty
gf2::BitVec::any Returns true if any bit in the vector is set.
gf2::BitVec::all Returns true if every bit in the vector is set.
gf2::BitVec::none Returns true if no bit in the vector is set.
gf2::BitVec::count_ones Returns the number of set bits in the vector.
gf2::BitVec::count_zeros Returns the number of unset bits in the vector.
gf2::BitVec::leading_zeros Returns the number of leading unset bits in the vector.
gf2::BitVec::trailing_zeros Returns the number of trailing unset bits in the vector.

These methods efficiently operate on words at a time, so they are inherently parallel.

Mutators

The following methods let you mutate the entire vector in a single call.

Function Description
gf2::BitVec::set_all Sets all the bits in the vector to the passed value, which defaults to true.
gf2::BitVec::flip_all Flips the values of all the bits in the vector.

The methods operate on words at a time, so are inherently parallel.

Copies & Fills

The following methods let you populate the entire vector in a single call.

Function Description
gf2::BitVec::copy Makes the bits in this vector identical to those from various sources.
gf2::BitVec::random_fill Fills the vector with random 0's and 1's.

Copies

The gf2::BitVec::copy methods support copying bits from:

  • Another bit-store of the same size but possibly a different underlying word type.
  • A std::bitset of the same size as the vector.
  • An unsigned integer that has the same number of bits as the vector. The integer type need not be the same as the underlying Word used by the bit-vector.
  • A function or callable object that takes a single usize index argument and returns a boolean value for that index.
Note
In each case, the size of the source and destinations must match exactly and that condition is always checked unless the NDEBUG flag is set at compile time. You can always use a gf2::BitSpan to copy a subset of bits if needed.

Random Fills

By default, the random fill method uses a random number generator seeded with system entropy, so the results change from run to run. You can set a specific seed to get reproducible fills.

The default probability that a bit is set is 50%, but you can pass a different probability in the range [0.0, 1.0] if desired.

Spans

The following methods let you create a gf2::BitSpan, which is a non-owning view of some contiguous subset of bits in the vector.

Function Description
gf2::BitVec::span Returns a gf2::BitSpan encompassing the bits in a half-open range [begin, end).

There are two overloads of the gf2::BitVec::span method — one for const bit-stores and one for non-const bit-stores:

auto span(usize begin, usize end); // <1>
auto span(usize begin, usize end) const; // <2>
  1. Returns a mutable gf2::BitSpan that allows modification of the bits in the specified range.
  2. Returns an immutable gf2::BitSpan that does not allow modification of the bits in the specified range.

In both cases, the begin and end arguments define a half-open range of bits in the vector.

Mutability/immutability of the returned BitSpan is deep. The span's mutability reflects that of the underlying vector, so if the vector is mutable, so is the span, and vice versa.

This is similar to the C++20 std::span class for regular data collection types.

Note
A gf2::BitSpan also satisfies the gf2::BitStore concept, so you can take a span of a span.

Sub-vectors

The following methods create or fill independent bit-vectors with copies of some contiguous subset of the bits in the vector.

Function Description
gf2::BitVec::sub Returns a new gf2::BitVec encompassing the bits in a half-open range [begin, end).
gf2::BitVec::split_at Fills two bit-vectors with the bits in the ranges [0, at) and [at, size()).

The gf2::BitVec::split_at method can optionally take two pre-existing bit-vectors to fill, thereby avoiding unnecessary allocations in some iterative algorithms that repeatedly use this method.

Note
These methods do not alter the underlying vector.

Riffling

We have methods that can interleave (riffle) the bits in a vector with zeros.

Function Description
gf2::BitVec::riffled Fills a pre-existing bit-vector with the result of riffling this vector.
gf2::BitVec::riffled Returns a new bit-vector that is this vector with its bits interleaved with zeros.

If the vector looks like \(v_0 v_1 v_2 \ldots v_n\), then the riffling operation produces the vector \(v_0 0 v_1 0 v_2 0 \ldots v_n\) where a zero is interleaved between every bit in the original vector (there is no trailing zero at the end).

If you think of a bit-vector as representing the coefficients of a polynomial over GF(2), then riffling corresponds to squaring that polynomial. See the documentation for gf2::BitPoly::squared for more information.

Set/Unset Bit Indices

The following methods find the indices of set or unset bits in the vector.

Function Description
gf2::BitVec::first_set Returns the index of the first set bit in the vector.
gf2::BitVec::last_set Returns the index of the last set bit in the vector.
gf2::BitVec::next_set Returns the index of the next set bit in the vector after the passed index.
gf2::BitVec::previous_set Returns the index of the previous set bit in the vector before the passed index.
gf2::BitVec::first_unset Returns the index of the first unset bit in the vector.
gf2::BitVec::last_unset Returns the index of the last unset bit in the vector.
gf2::BitVec::next_unset Returns the index of the next unset bit in the vector after the passed index.
gf2::BitVec::previous_unset Returns the index of the previous unset bit in the vector before the passed index.

Iterators

The following methods create iterators for traversing the bits or underlying words in the vector:

  • Read-only iteration through the individual bits.
  • Read-write iteration through the individual bits.
  • Read-only iteration through the indices of the set bits.
  • Read-only iteration through the indices of the unset bits.
  • Read-write iteration through the underlying vector words.
Function Description
gf2::BitVec::bits Returns a gf2::BitsIter iterator over the bits in the vector.
gf2::BitVec::set_bits Returns a gf2::SetBitsIter iterator to view the indices of all the set bits.
gf2::BitVec::unset_bits Returns a gf2::UnsetBitsIter iterator to view the indices of all the unset bits.
gf2::BitVec::store_words Returns a gf2::WordsIter iterator to view the "words" underlying the vector.
gf2::BitVec::to_words Returns a copy of the "words" underlying the bit-vector.

There are two overloads of the gf2::BitVec::bits method — one for const bit-stores and one for non-const bit-stores:

auto bits(); // <1>
auto bits() const; // <2>
  1. Returns a mutable gf2::BitsIter that allows modification of the bits in the vector.
  2. Returns an immutable gf2::BitsIter that only allows one to view the bits in the vector.

Stringification

The following methods return a string representation for a bit-vector. The string can be in the obvious binary format or a more compact hex format.

Function Description
gf2::BitVec::to_string Returns a default string representation for a bit-vector.
gf2::BitVec::to_pretty_string Returns a "pretty" string representation for a bit-vector.
gf2::BitVec::to_binary_string Returns a binary string representation for a bit-vector.
gf2::BitVec::to_hex_string Returns a compact hex string representation for a bit-vector.

Other Operators and Functions

There are many operators and free functions defined for any gf2::BitStore compatible class, including:

Category Description
Equality Operator Operator to compare bit-stores, including bit-vectors for content equality.
Bit Shifts Operators to shift the bits in bit-vectors left or right.
Bit-wise Operators Operators to combine bit-stores using logical operations.
Arithmetic Operators Operators to add or subtract bit-stores.
Other Functions Dot products, convolutions, concatenation etc. for bit-vectors.

See Also

  • gf2::BitVec reference for detailed documentation with examples for each method.
  • BitStore for the concept API shared by all bit-stores.
  • BitSet for fixed-size vectors of bits.
  • BitSpan for non-owning views of some of the bits in a bit-vector.
  • BitMat for matrices of bits.
  • BitPoly for polynomials over GF(2).