intro

Computational Techniques in Theoretical Physics Section 12 Metropolis algorithm and application examples

Key Point

Metropolis algorithm

A useful method to generate an importance Markov chain in Monte Carlo sampling.

Applications

(1) generation of Gaussian distribution
(2) Ising model for magnets

Metropolis algorithm

An importance Markov chain Monte Carlo sampling for a specific distribution P(X) goes as follows:

An initial state (configuration): X₀

A sequence of states generated by a Markov chain through the transition probability W(X_i to X_i+1):

X₀ => X₁ => ... => X_i => ... => X

The starting state may be a unique state or may be generated at random with certain distribution P₀.
The new state X_i+1 is based on the old state X_i. The new state is generated in such a way so that the probability that the system is in state X_i+1 is just W(X_i to X_i+1)

Requirement on W(X_i to X_i+1):

It satisfies the detailed balance condition.

Easy to generate.

W(X_ito X_i+1) from Metropolis algorithm:

W(X -> X') = T(X -> X') min(1, P(X')/P(X))

where X =/= X', T(X -> X') = T(X' -> X)

The transition matrix T can be any distribution but must be symmetric.
Normally, it is chosen as:

                           const.    for X' inside a region around X
T(X -> X') = {
                           0            otherwise

The probability that the state does not change, W(X to X), is determined by the normalization condition:

sum_X' W(X to X') = 1

Complete description of Metropolis algorithm:

Task:
Generate a sequence X₀, X₁, X₂, ... , X_i, ... , with a probability distribution P(X_i).

Steps:

Specify an initial state X₀, set i = 0.

Generate a new state X'_i+1 according to T(X_i to X'_i+1).

Accept the new state with probability: min(1,P(X'_i+1)/P(X_i)).

That is, X_i+1 = X'_i+1 if a random number between 0 and 1 is less than P(X'_i+1)/P(X_i); retain the old state as (i+1)-th state, X_i+1 = X_i, otherwise.

set i= i + 1, go to step 2.

Many physical quantities can be computed as expectation values.
Expectation values (or averages) can be estimated by:

< A(X) > = sum_X A(X) P(X)
                ~ N^-1 sum _{i = M+1 to M+N} A(X_i)

It is important that we throw away the first M points in the average. We have to wait for the system approaching equilibrium before we use the points.
The reason for this is that the system is distributed according to P_k(X), and we want the distribution P(X). P_k(X) will converge to P(X) for sufficiently large k.
This convergence is usually fast. For efficiency consideration, usually we do not use every X_i even if we have reached equilibrium, since they are highly correlated. We calculate A(X_i) only every certain number of iterations or steps.

Examples of Metropolis Algorithm

Gaussian distribution

Starting with some arbitrary real value x

we then propose a new value as

x' = x + alpha (xi - 1/2)

where xi is a uniformly distributed random number between 0 and 1; alpha is some constant, say, 0.3.
The newly proposed value x' is distributed uniformly inside an interval of length alpha, centered at the original value x.

This choice of x' is equivalent to take a particular functional form of T(x to x'). As long as T is symmetric, the exact form of T does not matter as far as generating correct distribution is concerned.

The value x' will be accepted as the next point if a uniformly distributed random number between 0 and 1 is less than:

exp{ - 1/2 (x'2 - x2)}

otherwise, the old value x is taken as the next point.

This generates a sequence of points, which should be distributed according to Gaussian with zero mean and unit variance.

Ising Model for Magnets

What is Ising model?

Ising model is the simplest model for ferromagnets.

Important features of a magnet:

spontaneous magnetization at lower temperatures.

the magnetization disappears at some higher temperature Tc.

The phase below Tc is called ferromagnetic phase and that above Tc is called paramagnetic phase. Tc is called critical temperature and there is a phase transition at this temperature.

Two-dimensional system as an example:

On a L x L square lattice, we put a spin sigma_i at each lattice site i.
The spin here is a `classical' spin, not the quantum-mechanical operator. The spin can be thought as the z-component of the Pauli spin, taken only two values +1 and -1.

For each configuration denoted by:
{sigma} = (sigma₁, sigma₂, sigma₃, ...)
which is the collection of all the spins, the system has a definite energy given by:
H(sigma) = - J sum _{<i, j>}sigma_i sigma_j
                    - B sum_i sigma_i
where J is called coupling constant; B is proportional to the external magnetic field.
The first summation runs over the nearest neighbors only, i.e., each bond is summed once.

Such a model is also referred as nearest neighbor Ising model.

The second summation is over all the lattice sites.
When J is positive, the model describes a ferromagnet; while J < 0 describes an anti-ferromagnet.

Let's consider just one term in the nearest neighbor interaction. The interaction energy of neighbors takes only two values, +J or -J. For ferromagnet, the energy is lower if the two spins are parallel. It costs energy if the spins are pointing in different directions.

Statistical-mechanical description of Ising model:

Due to thermal fluctuation, the system will have some probability distribution in states other than equilibrium one.

If the temperature T is a fixed parameter, we have the famous Boltzmann distribution (Canonical distribution):

P( { sigma } ) ~ exp [ -H( { sigma } ) / k_B T ]
where k_B~ 1.38 x 10^-23 Joule/Kelvin is called the Boltzmann constant.
Our goal is to calculate average quantities such as energy, magnetization, correlation functions, etc.

Definition of physical quantities:

Internal energy per spin

u = 1/L^d < H >
   = 1/L^d sum _{{
sigma }}H( { sigma } )

Magnetization

m = 1/L^d < M > =

    = 1/L^d sum _{{
sigma }}   | sum _isigma_i | P( { sigma } )

The summation is over all possible states.
Since each site can have two states (spin up and spin down), we have 2^Ldstates for a system of linear size L in d dimensions.

Specific heat

c = du/dT
   = [ < H² > - < H >² ] / k_BT² L^d

Susceptibility

chi = dm/dB
               A ( < M² > - < M >² )   T < T_c
       = {
               A < M² >                          T > T_c

where A = 1/ k_BT L^d, the angular brackets < > denote average over the Boltzmann distribution and

M = | sum_i sigma_i |
is the absolute value of the total magnetization of a particular state.

The fourth-order cumulant

g = 1/2 [ 3 - < M⁴ >/ < M² >² ]

which is useful in determining the critical temperature T_c.

Monte Carlo simulation of the Ising model:

Steps:

Initialize sigma_i at each site with arbitrary spin values, e.g., all spin up, or up or down with equal probability.

The complete set of spins { sigma } is the abstract state X discussed in the previous sections.

Choose a site i at random and propose to flip the spin at that site, sigma'i = - sigmai.

The proposed state X' is the one with one spin at location i flipped. This amounts to take:

                                      1/L^d, if X and X' differ by 1 spin
             T( X to X') = {
                                      0,      otherwise

Calculate the energy increment

E'-E = H( { sigma' } ) - H( { sigma } )
         =   2J sigma_i sum _{neighbors of i} sigma_j
           + 2B sigma_i

Accept the proposed state as the new state if a uniformly distributed random number (between 0 and 1) is less than exp[- (E'-E)/kB T]; retain the old state as the new state otherwise.

go to step 2.

Homework