intro

Computational Techniques in Theoretical Physics

Section 3:
Random numbers and random number generators.

What is a Random Number?

Numbers that are ``chosen at random'' .

Applications:

Simulation of gambling.

Any other events that are random in nature.

Computer simulation of physics problems:

Statistical mechanics (temperature effect).
Random collision.
Systems with randomly distributed subunits (defects etc.).

Mathematics problems:

Solving Integrals.
Solving Schrodinger equations.

Is it possible to generate random numbers?

A truly random variable cannot be generated by computer using deterministic methods.

But we can generate random number sequences which are fairly close to the properties of a random variable for most practical applications.

Form of random number:

Uniformly distributed random variables or random numbers.

A uniform distribution on a finite set of numbers (integers) is one in which each possible number is equally probable. A distribution is generally understood to be uniform unless some other distribution is specifically mentioned.

Random variables follow a special distribution pattern:

Gaussian distribution.
Quantum physics: wavefunction

Random number generation:

Brief History:
The history of generating random numbers is interesting. At first, people who needed random numbers in their scientific work would draw balls out of a ``well-stirred urn'' or would roll dice or deal out cards. A table of over 40,000 random digits, ``taken at random from census reports,'' was published in 1927 by L. H. C. Tippett. Since then, a number of devices have been built to generate random numbers mechanically; the first such machine was used in 1939 by M. G. Kendall and B. Babington-Smith to produce a table of 100,000 random digits, and in 1955 the RAND Corporation published a widely used table of a million random digits obtained with the help of another special device. A famous random-number machine called ERNIE has been used to pick the winning numbers in the British Premium Bonds lottery.
Shortly after computers were introduced, people began to search for efficient ways to obtain random numbers within computer programs. A table can be used, but this method is of limited utility because of the memory space and input time requirement, because the table may be too short, and because it is a bit of nuisance to prepare and maintain the table. Machines such as ERNIE might be attached to the computer, but this would be unsatisfactory since it would be impractical to reproduce calculations exactly a second time when checking out a program; and such machines have tended to suffer from malfunctions that are difficult to detect.
The inadequacy of these methods led to an interest in the production of random numbers using the arithmetic operations of a computer. John von Neumann first suggested this approach in about 1946, using the ``middle-square'' method. His idea was to take the square of the previous random number and to extract the middle digits; for example, if we are generating 10-digit numbers and the previous value as 5772156649, we square it to get
33317792380594909201,
and the next number is therefore 7923805949.
There is a fairly obvious objection to this technique: how can a sequence generated in such a way be random, since each number is completely determined by its predecessor? The answer is that this sequence isn't random, but it appears to be. In typical applications the actual relationship between one number and its successor has no physical significance; hence the nonrandom character is not really undesirable. Intuitively, the middle square seems to be fairly good scrambling of the previous number.
Sequences generated in a deterministic way such as this are usually called pseudo-random or quasi-random sequences in the highbrow technical literature, but here we shall simply call them random sequences, with the understanding that they only appear to be random. Being ``apparently random'' is perhaps all that can be said about any random sequence anyway. Random numbers generated deterministically on computers have worked quite well in nearly every application, provided that a suitable method has been carefully selected. Of course, deterministic sequences aren't always the answer; they certainly shouldn't replace ERNIE for the lotteries.
Von Neumann's original ``middle-square method'' has actually proved to be a comparatively poor source of random numbers. The danger is that the sequence tends to get into a rut, a short cycle of repeating elements. For example, if zero ever appears as a number of the sequence, it will continually perpetuate itself.
Many random number generators in use today are not very good. There is a tendency for people to avoid learning anything about such subroutines; quite often we find that some old method that is comparatively unsatisfactory has blindly been passed down from one programmer to another, and today's users have no understanding of its limitations. We shall see that it is not difficult to learn the most important facts about random number generators and their proper use.

Linear congruential method
By far the most popular random number generators in use today are special cases of the following scheme.

We choose four ``magic numbers'':
m,      the modulus;           m > 0
a,       the multiplier;         0 < a < m
c,       the increment;         0 < c < m
X₀,    the starting value;   0 < X₀ < m
The desired sequence of random numbers X_n is then obtained by setting
X_n+1 = (a X_n + c) mod m,     n > 0,        (1)
where a, c, and m are properly chosen integer constants. This is called a linear congruential sequence. Taking the remainder mod m is somewhat like determining where a ball will land in a spinning roulette wheel.
For example, the sequence obtained when m = 10 and X₀ = a = c = 7 is
7, 6, 9, 0, 7, 6, 9, 0, ...
As this example shows, the sequence is not always ``random'' for all choices of m, a, c, and X₀; the principles of choosing the magic numbers appropriately will be outlined.
The above example illustrates the fact that the congruential sequences always ``get into a loop''; i.e., there is ultimately a cycle of numbers that is repeated endlessly. This property is common to all sequences having the general form X_n+1 = f(X_n). The repeating cycle is called the period.

Choice of modulus.

The value of m need to be rather large, since the period cannot have more than m elements.

Speed of generation: We want to pick a value so that the computation of a (X_n + c) mod m is fast.

m can be conveniently chosen as the word size of our computer:
Let w be the computer's word size, namely, 2^e on an e-bit binary computer (e=16 on Intel 8086 processor, e=32 on Silicon Graphics and 64 on Cray). The result of an addition operation is usually given modulo m; and multiplication mod w is also quite simple, since the desired result is the lower half of the product. Hence the C statement
x = a * x + c;
is the same as a x+ c modulo 2^e, assuming that a, x, and c are declared as unsigned int. Quick tricks for modulo w + 1 also exist if one writes in assembly languages.

Choice of multiplier.

The multiplier should give the period of maximum length. A long period is essential for any sequence that is to be used as a source of random numbers.

With only m different values are possible, the period surely cannot be longer than m. Can we achieve the maximum length, m?

Theorem A.

The linear congruential sequence defined by m, a, c, and X_0 has period length m if and only if

c is relatively prime to m;
b = a - 1 is a multiple of p, for every prime p dividing m;
b is a multiple of 4, if m is a multiple of 4.

Because of the strong condition for obtaining the period m, the random number generators in use typically have period few times smaller than m.

Commonly used parameters for linear congruential generators

a	m	c	period
7⁵	2³¹-1	0	2³¹-2
1664525	2³²	1013904223	2³²
69069	2³²	0	2³⁰
6364136223846793005	2⁶⁴	1	2⁶⁴

For the random number generators with period 2³⁰, it takes only a hour to finish the sequence (the full period) on our workstations.

Generating random numbers in C:

Eq.(1) can be efficiently implemented if m=2^e on an e-bit computer.

In such a case, the modulo operation is really unnecessary (and slow) since it is automatic. In multiplication, if the result is larger than the maximum possible value in a computer, the higher order bits are truncated.

As an example, in C we could write two lines of code

x = 1664525U * x + 1013904223U;
r = x * 2.328306437e-10;

each time a random number r in the range (0,1) is needed. The variable x must be unsigned and r could be double or float. It is more efficient to use the uniformly distributed random integer x between 0 and 2³²-1 directly.

C compilers supply a standard random integer function, rand() (header file <stdlib.h>).

Many implementations give only a 16-bit random number, which is not acceptable for Monte Carlo calculation, and should be avoided. A better choice is the drand48() which returns a double and is generated with 48-bit integers. The 64-bit random number generator above can be implemented with long long int type in SGI workstations.

Online slides

Homework 2