13 12 2 4 8 4 15 14 14 8 0 15 5 4 1 16 1 18 5 12 5 6 19 13 5 19 18 4 5 3 0 19 16 3 3 4 7 19 18 1 7 18 16 13 2 18 9 4 17 14 17 3 1 16 16 6 15 14 11 1 18 11 1 14 14 5 19 1 4 17 3 12 16 19 5 19 18 15 4 15 10 1 19 11 18 15 18 14 10 9 15 9 0 16 4 14 1 3 16 6 1 19 19 17 18 5 16 17 0 0 12 11 2 12 2 0 7 0 14 18 9 10 8 9 7 12 4 8 16 0 15 18 19 15 15 18 0 12 15 1 13 8 12 15 0 15 15 8 15 10 6 4 0 15 14 7 8 18 16 4 19 12 2 18 7 18 16 8 10 11 10 3 19 3 19 0 18 15 8 13 5 15 18 6 10 13 13 18 12 10 3 11 2 5 9 10 4 6 19 14 18 10 18 18 13 17 18 12 12 7 6 18 2 5 4 12 18 18 10 11 8 13 2 11 19 12 1 3 19 1 18 17 11 17 15 4 15 14 17 8 1 3 6 3 8 10 16 6 8 7 18 17 1 0 8 0 13 9 4 12 10 3 9 2 1 5 7 16 0 4 4 1 7 10 5 16 1 1 3 10 8 1 7 9 1 15 10 14 5 15 7 16 19 16 18 0 2 5 17 2 10 1 4 17 12 9 13 14 11 16 4 0 18 11 9 19 7 0 14 12 16 1 8 15 18 7 15 1 12 12 3 2 14 8 0 7 17 14 1 9 11 6 9 9 18 18 9 5 19 3 18 15 5 6 11 4 13 7 5 6 0 9 9 14 18 9 2 15 3 3 4 14 10 13 4 8 12 13 13 12 17 11 8 3 18 19 7 12 7 13 18 7 3 7 2 1 17 4 17 1 8 2 16 18 16 0 6 9 13 0 2 11 12 10 14 11 10 2 3 17 15 2 5 19 10 7 0 7 12 18 9 0 0 5 19 17 5 5 6 19 6 8 10 18 19 5 9 9 7 13 7 3 16 12 2 6 19 3 14 12 2 3 12 3 8 11 0 14 17 7 13 4 16 4 2 15 10 12 5 17 6 12 0 2 4 3 8 4 7 3 16 9 6 9 12 15 1 12 9 19 0 3 3 16 8 6 11 18 19 16 16 5 9 17 7 13 0 16 18 8 19 14 17 6 3 9 2 5 2 12 4 2 15 7 18 4 13 10 3 13 7 19 18 16 17 6 9 18 2 8 6 2 2 3 8 6 13 10 11 15 2 16 18 18 3 17 3 17 7 6 10 14 6 9 10 3 15 0 1 17 8 8 0 11 11 8 18 5 19 10 1 2 6 19 1 10 17 4 8 4 11 19 18 18 8 9 1 4 10 3 1 18 11 1 10 3 10 8 9 10 18 10 12 4 10 14 15 7 18 3 11 10 3 10 8 11 0 10 15 10 13 17 9 5 19 19 9 10 7 18 0 5 8 12 10 19 6 6 6 5 9 18 15 13 9 4 4 10 14 0 0 7 18 9 13 17 9 2 7 16 0 7 2 9 0 13 8 7 19 15 13 9 14 9 2 3 13 7 14 8 8 14 15 6 4 9 3 13 11 11 10 12 19 13 1 0 7 9 8 6 5 1 16 19 10 19 3 4 7 17 12 15 12 8 1 16 17 5 10 9 17 1 1 16 14 2 17 1 12 5 8 17 6 5 16 17 4 0 2 11 17 14 6 9 3 8 6 1 14 16 10 11 18 12 7 12 15 4 14 7 9 3 5 16 8 2 14 13 2 16 4 19 11 11 9 14 0 15 15 14 11 6 5 10 19 13 2 14 18 17 1 8 0 6 5 9 8 19 2 11 15 7 10 6 19 19 1 19 14 17 14 6 3 0 16 3 13 19 17 12 17 18 0 18 5 5 7 14 4 10 5 0 17 16 6 17 16 8 17 11 5 11 18 9 11 15 12 5 14 9 17 12 7 18 10 13 4 18 7 9 8 13 9 5 10 16 3 6 4 0 17 10 11 15 19 3 11 11 9 5 0 7 18 8 6 8 2 10 6 9 19 14 3 9 0 13 6 3 19 11 4 17 1 15 13 0 19 4 12 8 10 12 16 8 1 2 17 3 12 3 13 12 18 16 2 19 10 8 3 10 0 7 7 1 3 1 2 2 5 14 11 16 7 8 5 8 10 2 12 3 6 5 15 4 2 18 4 12 7 7 3 7 14 10 9 18 12 11 0 18 5 12 14 13 0 19 2 11 2 14

"Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin"
--John Von Neumann


These 1000 numbers ranging from 0 to 19, were produced with this simple Perl script:


#!/usr/bin/perl

for($i = 0; $i < 1000; $i++){
  print int(rand(20))," ";
}

At a first glance, these numbers seem random. But it is well-known that no number sequence produced by a computer can be truly random. It is very easy to observe that a specific number sequence is not random (say, the sequence 2, 3, 5, 7, 11, 13...) but the opposite, that is, to certify that a given sequence is really random, is a hard job. What we are trying to do when we produce "random" sequences with our computer, is to try to produce a sequence that has many of the properties of really random sequences. That's why these sequences are usually referred to as "pseudorandom sequences".

The three most important properties that a good pseudorandom sequence must hold are:

  1. It looks random. This means that it passes all the statistical tests of randomness that we can find.
  2. It is unpredictable. It must be computationally infeasible to predict what the next random bit will be, given complete knowledge of the algorithm or hardware generating the sequence and all of the previous bits in the stream.
  3. It cannot be reliably reproduced. If you run the sequence generator twice with the exact same input (at least as humanly possible), you will get two completely unrelated random sequences.

(taken from Bruce Schneier's Applied Cryptography)

Very often, it is next to impossible to say which exactly properties of random numbers are vital for a specific application. But on the other hand, it is always a good idea to run some kind of checks to a pseudorandom number generator, so that we can be sure that, at least, no degenerate sequences are produced. There can be generators that are really good, but a flaw in the design can often result to a very bad pseudorandom number generator.

There is a variety of checks that exist to test the properties of pseudorandom number sequences, and most of them are based on mathematics and statistics. In this writeup we examine the X2 statistical test (chi-square test, also mentioned in Bruce Schneier's Applied Cryptography, second edition, page 45) on random sequences. The basic idea of the X2-test is to check whether our numbers are "logically" distributed. That is, we make a comparison with the sequence we have against a random sequence. So, the chi-square test is a goodness-of-fit test (thanks to vuo for reminding me this). If we produce N positive integers, smaller than r, then we would expect to have N/r numbers for each value between 1 and r. But the frequencies of appearance for each number should not be exactly equal: That would not be random!

It is proved that it can be checked whether a number sequence is equally good distributed as a real random number sequence or not, using the following formula:

             Σ ( fi - N/r )2
            0 ≤ i ≤ r
X2 = --------------------------
                     N/r

This is the X2-test, and if the value of X2 is around r, then the numbers are random-like distributed, else they are not. "Around" is defined as X2 being in the range r ± 2√r. This is enough for a simple test of randomness, and it must be noted that first, the test is OK for N > 10•r, and just to be sure about the randomness of our pseudorandom number generator, the check must be run a couple of times with different pseudo random sequences.

So, for the 1000 numbers at the beginning of the writeup, the X2-test yields X2 = 15.04 which is within the range 20 ± 8.94. That means that the random number generator embedded into Perl is worth its money, thus defeating what they say, that free stuff is worth what you pay for ;-)

Of course, for applications that base their security on how really random a number sequence is, like cryptography, something more than the Perl or C embedded random function is needed. Or else 'weird correlations' will start to appear, and some cryptanalyst somewhere will be veeery happy about it.


Bibliography
Algorithms: Design methods and complexity calculation, Foto Afrati, Giorgos Papageorgiou
Applied Cryptography, Bruce Schneier

Log in or register to write something here or to contact authors.