xso::generator — Jumps

We have a non-member jump function to rapidly move any recommended xoshiro/xoroshiro generator or State far ahead in its random number stream.

template<typename State>
void
xso::jump(State& state, const std::array<typename State::word_type, State::word_count()>& jump_coeff)
  1. This jumps the generator using the jump_coeff argument.

The template parameter State type can be either a full xso::generator or State class, as both have the methods needed for the function to succeed.

To jump \(J\) steps ahead in the stream, you first call jump_coefficients(J), which returns an appropriate set of jump polynomial coefficients that you can use as the final argument in this jump function. See the documentation for jump_coefficients.

Other Generator Variants

These xso::jump and xso::jump_coefficients functions rely on the pre-canned characteristic polynomials embedded in the xoshiro.h. All our type aliased generators have their characteristic polynomial precomputed in that header file. This means we support jumping any of those generators by arbitrary steps.

If, instead, you are using some other variant of xoshiro/xoroshiro, you can still get its characteristic polynomial and jump polynomials by employing the extra, header-only, bit library. That is a package for GF(2), which can compute the needed characteristic polynomials for the generator’s transition matrix. Once the bit library is included, then the xoshiro.h header file defines extra non-member functions that can compute those characteristic and jump polynomials.

See the documentation for these functions at bit website.

Motivation for Jumping

In theory, Monte Carlo analyses and their kin that employ random number generators are easy to parallelize.

For example, if you have 128 compute cores available, you can start 128 separate analyses, each with its own random number generator. These run in parallel, and you aggregate the independent results as they come in.

Of course, having confidence in one type of generator is hard enough, let alone 128. Therefore, we generally use a single well-understood, well-tested generator and create 128 copies started with different seeds. However, if you pick the 128 seeds “at random,” there can be significant overlaps and correlations between the resulting output streams. Those unknowable effects will pollute any conclusions you draw from the results of the supposedly independent simulations.

Instead of picking 128 different seeds at random, an alternative approach is to pick a single seed \(s_0\) and then partition the resulting stream of random numbers into 128 sub-streams.

For example, you need to run xsg::rng64() some \(2^{256}\) times before the stream repeats. That is an enormous number — a simulation on any current computer will only ever consume a small fraction of that stream. Therefore, we can break it into sub-streams and then use those across our parallel computation.

In our example, we have 128 available compute cores and have thoughtfully picked a single seed state \(s_0\) for xso::rng64. From that seed \(s_0\) there stretches a stream of \(2^{256}\) states \(\left\{s_0, s_1, s_2, \cdots \right\}\). We want to split this parent stream into 128 equal non-overlapping sub-streams. Each sub-stream will have the following size: \[ \frac{2^{256}}{128} = \frac{2^{256}}{2^7} = 2^{259} \] which is still vast!

However, to parallelize the computation, we cannot have the 128 simulators all reference a single bottleneck random number generator!

Instead, we want to create the non-overlapping sub-streams lazily by just instantiating 128 copies of the generator. The first is seeded with \(s_0\), the next is seeded some \(2^{259}\) slots along, the next another \(2^{259}\) slots along from that, and so on. Each of the 128 simulations has its generator that happens to return partitions of a single parent stream and, therefore, should have nice independence properties.

The key to making this work is to efficiently jump ahead in the state stream. In our example, we need to go from any state \(s\) to another state that is \(2^{259}\) steps along from \(s\) in the random state stream. And that is exactly what all our various jump methods and functions accomplish.

These jump methods are vastly more efficient than using the discard method. Moreover, they can accommodate jump sizes like \(N = 2^{259}\) which will overflow even the argument type for discard(N) before that method ever gets going!

See Also

characteristic_coefficients
jump_coefficients
xso::partition
bit::characteristic_polynomial
bit::polynomial::reduce

jump technique
reduction technique

Back to top