High Dimensionality Pseudo Random Number Generators

Here I attempt to show:

A simple one-dimension pseudo-random number generator (PRNG)
How to extend this generator to higher dimensions
How to make this generator more efficient
How this technique can be used to find very long binary maximum length sequence generators

A Simple One Dimensional PRNG

With the recurrence relation:

$x_(i+1) = a*x_i mod p$,

where $p$ is a prime number, we can produce quite random looking sequences. For example, with $a=3$ and $p=7$, we get the sequence ${1, 3, 2, 6, 4, 5}$.

We can determine $x_i$ for any $i$ without recursion by unraveling the recursive relation^[1] to get

$x_i = a^i*x_0 mod p$

where $x_0$ is the initial starting value.

A sequence that cycles through every possible number is a maximum length sequence, which for our simple one dimensional recurrence relation has a period of $p-1$. Not every value of $a$ results in a maximum length sequence. In our example only $a=3$ and $a=5$ produce maximum length sequences.

We know that the maximum length of a sequence is $p-1$ and if a sequence does not have maximum length, then $p-1$ will be divisible by the sequence non-maximum length. To find values of $a$ that produce maximum length sequences, we can generate random values of $a$ and test to see if the sequence repeats for all possible divisors of $p-1$. For example if $p=7$, then $p-1=6$, which has divisors $2$ and $3$. We only need to examine if $a^2 mod 7=1$ and $a^3 mod 7=1$ to test if the sequence repeats.

Property: Non-optimal sequences will have a period length which evenly divides into $p-1$

Lets take a look at a larger example with $p=227$. Since $p-1=226$ is divisible by $2$ and $113$, we can try random values of $a$ making sure that $a^2 mod 227!=1$ and $a^113 mod 227!=1$. We see that $a=21$ is not optimal because $21^113 mod 227 = 1$, however $a=20$ is optimal because $20^2 mod 227 = 173$ and $20^113 mod 227=226$, neither of which is $1$. Here is a graph of the entire sequence for $(20,227)$.

Figure 1. Plot of PRNG values with p=227 and a=20

While this result seems random, we can see the underlying structure by doing a phase plot, which is a plot of the pairs $(x_i, x_(i+1))$. Figure 2 below shows the results of our $(20,227)$ PRNG.

Figure 2. Phase plot of PRNG with p=227 and a=20

In this view we can see that the non-random nature of the sequence. There is no chance of most $(x_i, x_(i+1))$ pairs ever occurring. We could never have two equal values as a pair for instance such as (50, 50). This is due to the lack of memory of our recurrence relation. Each new value depends only the previous value before it. We could never have a pair like (50, 50), unless our entire sequence was nothing but 50 repeated for the entire sequence.

A Multi-Dimensional PRNG

Extending our simple PRNG into multiple dimensions can be done by considering $a$ to be a matrix ($bar A$) and $x_i$ to be a vector ($bar X_i$), i.e.

$bar X_i = bar A . bar X_(i-1) mod p$

Similar to our simple one dimenion recurrence relation, we can unwrap this equation to be

$bar X_i = bar A^i . bar X_0 mod p$

In this case, a maximum length sequence will step through all possible combinations of the vector $X_i$. For a matrix of rank $d$, there are $p^d-1$ combinations. For a maximum length sequence, the sequence MUST repeat after $p^d-1$ vector tuples, which means that for a maximum length sequence $bar A^(p^d-1) mod p=bar I$

Property: For a matrix of rank $d$, the maximum length sequence is $p^d-1$ long and matrix $bar A$ has the property $bar A^(p^d-1) mod p=bar I$

For example, consider the sequence generated by the matrix $bar A = [[2,1],[3,3]]$, and prime $p=7$ Starting with $X_i=[0,1]$, we get the sequence

(0,1), (1, 3), (5, 5), (1, 2), (4, 2), (3, 4), (3, 0), (6, 2), (0, 3), (3, 2), (1, 1), (3, 6), (5, 6), (2, 5), (2, 0), (4, 6), (0, 2), (2, 6), (3, 3), (2, 4), (1, 4), (6, 1), (6, 0), (5, 4), (0, 6), (6, 4), (2, 2), (6, 5), (3, 5), (4, 3), (4, 0), (1, 5), (0, 4), (4, 5), (6, 6), (4, 1), (2, 1), (5, 2), (5, 0), (3, 1), (0, 5), (5, 1), (4, 4), (5, 3), (6, 3), (1, 6), (1, 0), (2, 3)

Every possible number pair except (0,0) occurs in this sequence. As before, we can test for non-optimal sequences by testing that $bar A^x mod p != bar I$ for all possible $x in$ {divisors of $p^d-1$}

Property: Non-optimal vector sequences will have a period length which evenly divides into $p^d-1$. We can test our matrix $bar A$ by making sure that $bar A^x mod p != bar I$ for all possible $x in$ {divisors of $p^d-1$}.

Testing for all possible divisors

What does testing all possible divisors mean? I want to emphasize that all divisors means more that just testing all factors. It means testing all factors as well as the unique product of all possible combinations of factors. Consider the number of divisors for a rank 3 matrix and prime $p=227$. The factors of $227^3-1$ are 2, 73, 113, and 709. The possible divisors are then 2, 73, 113, 146, 226, 709, 1418, 8249, 16498, 51757, 80117, 103514, 160234, 5848541. Since combinations of factors grows exponentially, we want to be careful in our selection of matrix rank and prime number combinations, especially if we are working with large primes and matrix ranks. If we choose poorly, then the number of tests required may be prohibitive. [Anecdotally, I have found that matrices that have a non-prime rank tend to have more factors, so I stick with matrices with a prime rank]. If each factor is unique i.e. a number is not a factor more than once, then the number of possible divisors will grow as $sum_(i=1)^(n-1) (n!)/(i!(n-i)!)$. For example, a $p^d-1$ number with 20 unique factors would require over 1 million tests for every matrix candidate.

Sequence autocorrelation

While an optimal sequence never repeats for $p^d-1$ elements, it is correlated for every $(p^d-1)/(p-1)$ segments. It repeats as a constant factor mod p every $(p^d-1)/(p-1)$ elements. We can see this if we do an auto-correlation over the entire sequence. For the sequence $bar A = [[4, 6, 1], [6, 0, 2], [3, 6, 0]]$, and $p=7$, there are autocorrelation spikes every $(p^d-1)/(p-1)=(7^3-1)/(7-1)=57$ elements.

Figure 3. Autocorrelation for bar A = [[4, 6, 1], [6, 0, 2], [3, 6, 0]], and p=7

While the sequence does not repeat, this correlation is undesirable and the practical maximum length of our sequence is $(p^d-1)/(p-1)$ elements long. For a large prime $p$, the effective random sequence length can be approximated as $p^(d-1)$

Property: For a matrix of rank $d$, the practical maximum length sequence is $(p^d-1)/(p-1) approx p^(d-1)$ elements long.

Optimal sequence finding algorithm

The procedure for finding an optimal matrix $A$ for prime $p$ is:

populate matrix $A$ with random values between $0$ and $p$.

Test that $bar A^(p^d-1) mod p=bar I$. If not then go back to step 1.

Test that $bar A^(x) mod p!=bar I$ for all possible $x$ divisors of $p^d-1$. If not then go back to step 1.

Applying these steps I found this rank 3 matrix for prime 227

$bar A = [[125, 192, 139], [223, 27, 176], [198, 181, 157]]$

With this small prime we have a practical maximum period of $227^(3-1) = 51529$ numbers between 0 and 226

Here is an example graph where I do a phase plot of the first element of vector $X_i$ for $p=227$ for the first 10,000 values.

Figure 4. Phase plot bar A = [[125, 192, 139], [223, 27, 176], [198, 181, 157]], and p=227

A more efficient version

Ideally we would like use large matrices in order to achieve high randomness 'complexity' and long maximum sequence length, however the number of multiplications necessary to generate a sequence grows as $d^2$ i.e. complexity is $O(n^2)$. We can do better if we constrain our matrices to be of the form

$[[a_1, a_2, a_3, a_4, ..., a_(n-1), a_n], [1,0,0,0,...,0,0], [0,1,0,0,...,0,0],[0,0,1,0,...,0,0], [...,...,...,...,...,...,...], [0,0,0,0,...,1,0]]$

In matrix form we can still use the techniques that we used previously to find optimal sequences, however the number of multiplies now only grows as $d$ i.e. complexity is $O(n)$.

Once we have found our coeffiences $a_i$, we can calculate our sequence as the recurrence relation:

$x_i=sum_(j=1)^d a_j * x_(i-j)$

For example for $d=3$ and $p=7$, we find optimal sequence coefficients $a=[6,2,5]$, which gives us a maximum length sequence of 342 elements and a practical sequence length of 57 elements. Plotting the entire sequence we see that it looks nicely random and does not repeat.

Figure 5. Plot of a=[6,2,5], p=7

Binary Maximum Length Sequences

As we discovered for maximum length sequences, we need to test all $x in$ {divisors of $p^d-1$} to make sure $bar A^(x) mod p!=bar I$. It would be nice if we could find $(p,d)$ combinations so that $p^d-1$ was prime, then there would not be any divisors and we would then only to test that $bar A^(p^d-1) mod p=bar I$. Unfortunately it is easily shown that

$p^d-1=(p-1)(1+p+...+p^(d-1))$

So $p^d-1$ always has at least two factors unless $p=2$. If $p=2$ then we can find $2^d-1$ that are prime. These are the so-called Mersenne prime numbers. An advantage of using $p=2$ is that integer multiplications can be replaced with logic operators. Depending on the system, Mersenne prime PRNG with logic operations (instead of multiplies) may be more efficient way to produce a random sequence. We could produce very long binary sequences similar to how the Mersenne Twister PRNG does.

Efficiency considerations

Multiplication is often considered to be slow. We can easily transform our PRNG to use bitshifts instead of multiplication by testing for matrices that contain elements that are a power of 2 only. Our recurrence forum can then become

$x_i=sum_(j=1)^d x_(i-j)$<<$n_j$

where $n_j$ exponent for the matrix element of the power of $a_j=2^(n_j)$ and << is the bitshift operator.

While this seems like it should be faster than the multiplication version, when I tested this on my PC with Intel i7 chip and there was no difference. There is a lot of dedicated multiplication circuitry on many modern CPUs, so on some architectures it seems there is very little difference in speed.

To obtain maximum speed it seems much more important to minimize use of the modulus operator, which is computationally expensive (here I did notice a significant difference in performance). One has to be aware of possible overflow and make sure that all multiplies and additions will fit within the integer size used to calculate the results. Ideally one would like to get rid of the modulus operation altogether by using a value of $p$ that is a power of 2. However $p$ of a power of 2 are not primes and will have a lot of factors ($n$ factors for $2^n$). I'm not sure if it's even possible for maximal length sequences to exist with non-prime $p$, although if one is really concerned about speed, looking into 'long' sequences (if not maximal) for $p=2^n$ might be desirable. Eliminating multiplication and modulus operations would result in the fastest PRNG possible as the only operations would be bitshift and addition.