Let X = (x_1, x_2, ..., x_d) be a point in a d-dimensional space,
and dX = dx_1 dx_2 ... dx_d the volume element.
We want to evaluate the integral over the domain Omega:
G = int_Omega g(X) p(X) dX
where p(X) is interpreted as a probability (density) satisfying
p(X) > 0,    int_Omega p(X) dX = 1.
The basic Monte Carlo method of calculating the integral is to draw a set (sequence) of random values
X_1, X_2, ..., X_N
distributed according to the probability p(X), and to form the arithmetic mean
G_N = (1/N) sum_{i=1}^N g(X_i).
The quantity G_N is approximately G, and we say G_N is an estimator of G.
It is crucial to understand why p(X_i) does not multiply g(X_i). If the points X_i were equally spaced in the domain Omega, we would have to include the factor p(X_i). But since the X_i are drawn according to p(X), there are already more points where p(X) is large, so the weighting is built into the sampling itself.
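As a minimal sketch of this recipe (one-dimensional for simplicity; the names estimate, g, and sample_p are hypothetical, not from the text), the estimator is just a plain average of g over the sampled points, with no factor of p(X_i):

/* Average g over N points drawn from p; the p(X_i) weighting comes
   from the sampling itself, so no explicit factor appears. */
double estimate(double (*g)(double), double (*sample_p)(void), long N)
{
    double sum = 0.0;
    for (long i = 0; i < N; ++i)
        sum += g(sample_p());   /* hypothetical sampler draws one point from p */
    return sum / N;             /* G_N, the estimator of G */
}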
How should we divide the integrand between g(X) and p(X)?
Very often the probability p(X) itself is a uniform distribution, p(X) = constant, in the domain Omega. Other times we need a different form of distribution; e.g., in statistical mechanics we like to have the Boltzmann distribution, p(X) ~ e^{-E/(k_B T)}. Splitting the integrand g(X) p(X) into a probability p(X) and a quantity g(X) is somewhat arbitrary and is done for the convenience of computation. However, the choice of p(X) does influence the statistical accuracy of the Monte Carlo estimates.
Fundamental theorems
We'll just state two important theorems which form the theoretical basis of the Monte Carlo method.
(Strong) Law of Large Numbers:
The arithmetic mean G_N converges with probability one to the expectation value G. In mathematical notation,
P{ lim_{N -> infinity} G_N = G } = 1.
This theorem guarantees the convergence of Monte Carlo estimates to the correct answer in the limit of infinitely many points. But it does not tell us how fast the estimates converge. The next theorem tells us more.
Central Limit Theorem:
Let's define the variance as
sigma^2 = int_Omega (g(X) - G)^2 p(X) dX = <g^2> - <g>^2
and a normalized value (also a random variable)
t_N = sqrt(N) (G_N - G) / sigma,
then
lim_{N -> infinity} P{ a < t_N < b } = int_a^b (1/sqrt(2 pi)) e^{-t^2/2} dt.
In other words, G_N is a random variable distributed according to a Gaussian distribution with mean G and variance sigma^2/N for sufficiently large N.
As N goes to infinity, the observed G_N falls in an ever narrower interval near G, and one can predict the probability of deviations measured in units of sigma. The observed G_N is within one standard error (i.e., sigma/sqrt(N)) of G 68.3% of the time, within two standard errors 95.4% of the time, and within three standard errors 99.7% of the time. These percentages are properties of the Gaussian distribution.
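A quick numerical illustration of this narrowing (a sketch; the test integrand g(x) = x^2 with uniform p on (0,1), whose exact mean is G = 1/3, is my own choice):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Watch |G_N - G| shrink roughly as 1/sqrt(N) for g(x) = x^2, G = 1/3. */
int main(void)
{
    srand48(1);                              /* arbitrary seed */
    for (long N = 100; N <= 10000000; N *= 10) {
        double sum = 0.0;
        for (long i = 0; i < N; ++i) {
            double x = drand48();            /* uniform x in (0,1) */
            sum += x * x;
        }
        printf("N = %8ld   |G_N - G| = %.2e\n", N, fabs(sum / N - 1.0 / 3.0));
    }
    return 0;
}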
Error of Monte Carlo calculation
From the central limit theorem, we have
G = G_N + error.
The error of Monte Carlo estimates for G is
|error| = epsilon ~ sigma / sqrt(N).
From the definition of sigma, we can also estimate the error epsilon as well as the value G itself:
sigma^2 = int_Omega (g(X) - G)^2 p(X) dX = <g^2> - <g>^2.
However, it is better to use the following unbiased estimator for sigma^2:
sigma^2 ~ N/(N - 1) { (1/N) sum_{i=1}^N g^2(X_i) - ( (1/N) sum_{i=1}^N g(X_i) )^2 }.
The Monte Carlo error decreases as 1/sqrt(N) as the number of samples (points) N increases. Since N is proportional to the computer CPU time T, the error epsilon is proportional to T^{-1/2}.
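In code, the unbiased estimator can be accumulated in the same loop as the mean. A sketch, reusing the hypothetical g and sample_p from the earlier fragment:

double sum = 0.0, sum2 = 0.0;
for (long i = 0; i < N; ++i) {
    double gi = g(sample_p());   /* hypothetical integrand and sampler */
    sum  += gi;                  /* running sum of g   */
    sum2 += gi * gi;             /* running sum of g^2 */
}
double mean  = sum / N;
double var   = (double)N / (N - 1) * (sum2 / N - mean * mean);  /* unbiased sigma^2 */
double error = sqrt(var / N);    /* epsilon ~ sigma / sqrt(N) */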
Let's compare the efficiency of the Monte Carlo method with that of deterministic quadrature methods (e.g., the trapezoidal or Simpson rule). The deterministic methods typically have error
epsilon ~ h^k
due to discretization, where h is the grid size. The computer time goes as
T ~ h^{-d}
in d dimensions when the grid size is decreased. Thus the error goes as
epsilon ~ 1/T^{k/d}.
For the Simpson rule of numerical integration, k = 4. We see that for a one-dimensional integral the error decreases much faster than that of the Monte Carlo method. However, as the dimension of the integral increases, the Simpson rule becomes worse than the Monte Carlo method for a fixed amount of computer time: with k = 4 the Simpson error already scales as T^{-1/2} at d = 8, the same rate as Monte Carlo, and more slowly for larger d.
In conclusion, standard numerical quadrature is very good for low dimensions; the Monte Carlo method is superior for higher-dimensional integration.
Some examples
Example 1:
Evaluate the integral
int_0^1 sqrt(1 - x^2) dx = pi/4.
Solution:
Let's take the probability distribution as
p(x) = { 1, 0 < x < 1
       { 0, otherwise
so that x is uniformly distributed in the unit interval. With
g(x) = sqrt(1 - x^2)
we can cast the integral in the required form
int g(x) p(x) dx.
The Monte Carlo estimator for the integral is
G_N = (1/N) sum_{i=1}^N g(x_i) = (1/N) sum_{i=1}^N sqrt(1 - x_i^2).
This can be implemented in C with a few lines:
sum = 0.0;
for (i = 0; i < N; ++i) {
    x = drand48();                /* uniform random x in (0,1) */
    sum += sqrt(1.0 - x * x);     /* g(x) = sqrt(1 - x^2) */
}
mean = sum / N;                   /* estimate of pi/4 */
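For completeness, a self-contained version might look like the following sketch (the sample count N, the seed, and the printed comparison with pi/4 are my own additions; compile with -lm):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
    long i, N = 1000000;            /* number of samples (arbitrary choice) */
    double sum = 0.0;

    srand48(12345);                 /* arbitrary seed */
    for (i = 0; i < N; ++i) {
        double x = drand48();       /* uniform x in (0,1) */
        sum += sqrt(1.0 - x * x);   /* g(x) = sqrt(1 - x^2) */
    }
    printf("G_N = %g (exact pi/4 = %g)\n", sum / N, M_PI / 4.0);
    return 0;
}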
Example 2:
Evaluate the integral
int_0^infinity e^{-x} cos x dx
and estimate the Monte Carlo error.
Solution:
Since the domain of integration runs from 0 to infinity, it is impractical to sample the value x uniformly. However, it is quite natural to sample x over this domain from the exponential distribution:
p(x) = e^{-x},
and then
g(x) = cos x.
Even though x can take any positive value, the probability that it takes a large value is very small. We sample the exponential distribution by
x = -ln xi,
where xi is a random number uniformly distributed between 0 and 1. A Monte Carlo estimator for the integral is then
G_N = (1/N) sum_{i=1}^N cos(ln xi_i)
(the minus sign in x_i = -ln xi_i can be dropped because cosine is even). Question: why do we omit the exponential factor?
We can also estimate the error sigma/sqrt(N) in G_N by a Monte Carlo calculation of sigma:
sigma^2 ~ N/(N - 1) { (1/N) sum_{i=1}^N cos^2(ln xi_i) - G_N^2 }.
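A self-contained sketch of this calculation (the sample count, seed, and the comparison with the exact value 1/2 are my own additions; compile with -lm):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
    long i, N = 1000000;                 /* number of samples (arbitrary choice) */
    double sum = 0.0, sum2 = 0.0;

    srand48(2);                          /* arbitrary seed */
    for (i = 0; i < N; ++i) {
        double xi;
        do { xi = drand48(); } while (xi == 0.0);  /* avoid ln(0) */
        double gi = cos(log(xi));        /* cos(ln xi) = cos(x) at x = -ln xi */
        sum  += gi;
        sum2 += gi * gi;
    }
    double mean = sum / N;
    double var  = (double)N / (N - 1) * (sum2 / N - mean * mean);
    printf("G_N = %g +- %g (exact value 1/2)\n", mean, sqrt(var / N));
    return 0;
}

Homework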