The xso::partition Class

Introduction

The xso::partition class partitions a single stream of random numbers into a collection of non-overlapping sub-streams. This is primarily useful for setting up parallel computations.

Declaration

The partition class is in the xso namespace and is declared as follows:

namespace xso {
    template<typename RNG> class partition;
}

Template Parameter

The template parameter RNG should satisfy the requirements of a State type. In particular, it should have a class method characteristic_coefficients that works.

Instance Methods

The xso::partition class has just a couple of instance methods:

Construction

partition(const State& gen, std::size_t n_partitions);

Here, gen is a parent generator or State, and we want to partition its random state stream into n_partitions non-overlapping sub-streams. Each of those will be owned by a copy of gen, seeded with the correct starting state for the sub-stream.

The constructor computes an appropriate jump size \(J\) that depends on the n_partitions argument. It then uses the State characteristic coefficients to compute an appropriate jump polynomial for jumps of size \(J\).

The next() Method

State next();

This returns a new State that is a copy of the parent gen but seeded at the start of the next partition. The returned RNG will be the same as gen, with its state jumped along by some multiple of \(J\) steps

Example

The xso::partition class is best illustrated with a sketch example.

Example: Setting up for parallel processing

#include <xoshiro.h>
int main()
{
1    std::size_t     n_cores = available_cores(...);
2    xso::rng        gen;
3    xso::partition  partition(gen, n_cores);
    for(std::size_t job = 0; job < n_cores; ++job) {
4        spawn_job(partition.next(), ...);
    }
    ...
}
1
Some method that returns the number of compute cores we have available.
2
Create a randomly seeded parent generator.
3
This constructs a partition of the parent state stream into n_cores non-overlapping pieces.
4
And we’re off spawning an independent job for each core, handing each its own sub-stream generator to work with.

In the sketch, we first determine how many cores are available for the simulation. That may be a fixed number, or some function determines the number on the fly. For the sake of this example, suppose that n_cores comes back as 100.

We want to partition the random number stream from gen into 100 pieces and hand each core its own sub-stream to work with. So, we set up an xso::partition object to do just that.

The loop spawns jobs where each one is given its own random number generator. The generators come from calling the partition.next() method. The xso::partition class ensures that these generators are identical to the original `gen, except that they are seeded appropriately far along to the next sub-stream.

The parent generator has 256 bits of state, so its random state stream has \(2^{256}\) slots. This means that all the sub-streams still have a vast number of random deviates to use up before there is any chance of getting overlaps.

Implementation Note

In the example above, we use a random number generator xso::rng, which has 256 state bits. Those generators will step through an enormous \(2^{256}\) states before repeating the cycle.

We have 100 cores available for parallel computation. Ideally, we want to partition the random number stream from gen into 100 pieces and hand each core its own non-overlapping sub-stream of size: \[ \frac{2^{256}}{100}. \] However, that number will overflow any normal 32-bit or 64-bit integer! Instead, the partition class will create \(N =2^p\) sub-streams, where \(N\) is a power of two close to, but larger than 100. In this case, that power will be 7 as \(2^7 = 128 \ge 100\).

The number of created non-overlapping partitions is 128 instead of 100.

We only use 100 of the 128 as we spawn independent jobs, so each partition is therefore slightly smaller than the ideal, and of course, we are not using any of the final 28 partitions at all. However, each of the sub-streams still has some \(2^{256 - 7} = 2^{249}\) available random numbers. That is vastly more than a job on any present-day computer will ever consume, so the “wastage” is completely insignificant.

This class requires the State to have its pre-computed* characteristic coefficients available. This is true for all our type aliased recommended generators.

See Also

characteristic_coefficients
jump_coefficients
jump
discard

jump technique

Back to top