Doubling in backgammon

You are playing backgammon (wlog as white). (The rules of backgammon are explained more clearly here.) At any given point during a round, there are six possible outcomes of the round, with corresponding values:

  • white wins by a backgammon (+3n)
  • white wins by a gammon (+2n)
  • white wins (+n)
  • red wins (-n)
  • red wins by a gammon (-2n)
  • red wins by a backgammon (-3n)

At the start of a round, n=1 and both players have control of the doubling cube.

When it is a player’s turn (say Alice, playing against Bob), she may offer a double if and only if she controls the doubling cube. If Alice offers a double, then Bob must accept or decline. If Bob declines, then Bob forfeits the round and Alice gains n points. If Bob accepts, then Alice ceases to control the cube, Bob gains sole control of it, and n is multiplied by 2.

(There are other rules.)

The problem

Suppose that both you and your opponent can perfectly calculate the probability of each of the above outcomes.

  • It is your turn. Should you offer a double?
  • It is your opponent’s turn, and they have offered a double. Should you accept?

For a harder problem, suppose that you assign probabilities p_i to each of the above outcomes, and your opponent assigns different probabilities q_i, but you know both the p_i and the q_i.

Solving ODEs in C

I set my students an exercise: to solve the nonlinear differential equation

\ddot{x} + \sin(x) = A\sin(\omega t).

What follows is an outline of a model solution (but by no means complete, or optimal). I take this opportunity to introduce several other features of C that I haven’t had the chance to talk about in the classes.

This equation is, of course, the (nondimensionalised) equation of motion of a forced pendulum, where x represents the angle that the pendulum makes with the downward vertical. If A = 0 or \omega = 0, then the pendulum is unforced, and executes regular oscillations. The period of these oscillations is approximately equal to 2\pi for small-amplitude oscillations (where the motion is approximately simple harmonic), but increases as the amplitude increases. If forcing is turned on, then a second period, that of the forcing, is introduced. The pendulum may still oscillate with its ‘intrinsic’ period, or, if the forcing is strong enough, the pendulum may start to oscillate with the period of the forcing. If the period of the forcing is close to the intrinsic period of the pendulum, then resonance may occur.
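The small-amplitude period of 2\pi follows from linearising the unforced equation about x = 0:

```latex
% For |x| << 1, \sin x \approx x, so the unforced equation
\ddot{x} + \sin x = 0
% reduces to simple harmonic motion,
\ddot{x} + x = 0,
% with general solution and period
x(t) = a\cos t + b\sin t, \qquad T = 2\pi.
```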

We first write the equation as a system of two equations,

\dot{x} = y \text{ and } \dot{y} = - \sin(x) + A\sin(\omega t)

which is first-order. We then solve this system numerically using the Runge-Kutta method; this is what the program diffsolve does. Its full source code can be downloaded here. The code, as written, is quite general: it can solve any system of two ODEs of the form

\dot{x}=f(t,x,y) \text{ and } \dot{y}=g(t,x,y).

Annotation of the code

After the initial comments and the #includes, we first come across a typedef struct:

typedef struct {
    double amp;
    double omega;
} problem_t;

I declare a type, problem_t, which will store the parameters of the problem, amp and omega. I usually wrap all parameters into a struct like this, because it lets me pass a single variable, of type problem_t, to a function, instead of passing each of the individual parameters. It also means that, if I want to change my program to allow more parameters, I can simply change the declaration of problem_t, rather than changing the definition of every function that needs these parameters. This pays off when we work with problems with ten or twenty parameters.

The first function defined makes use of this:

double forcing(double t, problem_t params) {
    return params.amp * sin( params.omega * t );
}

This is used by the function rhs_y, which calculates \dot{y} . (You could do away with the function forcing and have the above inside rhs_y, if you prefer.)

The function rhs_x, which gives the derivative \dot{x} , appears to require arguments that it does not use:

double rhs_x(double t, double x, double y, problem_t params) {
    return y;
}

Although the arguments t, x and params are not actually used, I have chosen to include them anyway, because in the future I might want to move from solving \dot{x} = y to something more general, of the form \dot{x} = f(t,x,y) .

Now we come to the main function. The first thing we notice is that the function prototype says

int main(int argc, char* argv[]) 

instead of simply

int main()

We want to be able to run our program without having to recompile the program each time we want to change our parameters such as initial conditions. We will type something like

$ ./diffsolve 1 2 100 0.005 1 0

into the command line, in order to specify the different parameters to diffsolve. These are command line arguments, and they are treated by C as arguments to main. The argument argc is the argument count, the number of arguments; and argv is the argument vector, an array of strings. Note that the program’s name, ./diffsolve, is treated as the 0th argument, and so argc is always at least equal to 1.

The arguments are actually read in the following block:

if (argc > 6) {
    params.amp   = atof(argv[1]);
    params.omega = atof(argv[2]);
    tmax  = atof(argv[3]);
    dt    = atof(argv[4]);
    x     = atof(argv[5]);
    y     = atof(argv[6]);
} else {
    fprintf(stderr, "Usage: %s amp omega tmax dt x0 y0\n", argv[0]);
    exit(-1);
}

If the user has specified at least seven arguments (where the 0th ‘argument’ is the name of the program itself), then these arguments are read in. The atof function converts strings to doubles (despite its name, which suggests ‘ASCII to float’).

If the user has not specified enough arguments, then the program gives an error which tells the user how to use it, and exits with a return code of -1, which (by convention) indicates an error. (The exit function is provided in stdlib.h.) The error message is printed using fprintf, which stands for ‘printf to a file’, or ‘print formatted text to a file’. Here the ‘file’ is stderr, the standard error stream. In contrast, the function printf writes to the standard output stream, represented by stdout. By default, both stdout and stderr write to the screen, but the two streams may be separated from each other. It is a good idea to use stderr for errors, warnings or diagnostic messages, and stdout for actual output that you want to process.
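For example, in a Unix-like shell the two streams can be redirected to different files. Here is a tiny self-contained illustration, using echo in place of diffsolve:

```shell
# Write one line to stdout and one to stderr, then separate the streams:
( echo "result line"; echo "warning: something happened" 1>&2 ) \
    > out.txt 2> err.txt

cat out.txt   # the actual output
cat err.txt   # the diagnostic message
```

Running diffsolve as `./diffsolve 1 2 100 0.005 1 0 > out.txt 2> err.txt` would likewise keep the results and the usage message apart.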

The rest of the program is fairly straightforward: the values of t, x and y are updated in a while loop, and at each time step their values are printed (to standard output).

Usage (on Unix-like systems)

After compiling, the command

$ ./diffsolve 1 2 100 0.005 1 0

will print plenty of lines, giving the solution of the system at each timestep.

You probably want to save this information to another file, which you can do by writing

$ ./diffsolve 1 2 100 0.005 1 0 > out.txt

This will write the results to the file out.txt, which you can then process somewhere else. For example, you can use gnuplot to plot the results:

$ gnuplot
gnuplot> set terminal png
gnuplot> set output "out.png"
gnuplot> plot "out.txt" using 1:2 with lines

This produces the image out.png, a plot of x (the second column) against t (the first).

Mathematica modules

In Mathematica, a module is an extended function that may perform several steps. An advantage of a module is that it may declare and define local variables which can be used inside the module but not in the global scope. It is generally good programming practice to keep variables inside as local a scope as possible, because variables in global scope can conflict with each other if you are not careful to keep track of their names.
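A small illustration of this scoping behaviour (a throwaway example, not related to the solver below):

```mathematica
(* The x inside the module is local: it shadows, and does not
   disturb, the global x. *)
x = 10;
square = Module[{x = 3}, x^2];  (* square is 9 *)
x                               (* still 10: the global x is untouched *)
```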

A module for solving the KdV equation

Below is an example usage of a module. The module solat computes the solution to the KdV equation

u_t + 6uu_x + u_{xxx} = 0

at a desired time tm, given initial data data0. It uses a split-step pseudospectral integrator, whose details, as well as an equivalent code in Matlab, can be found at Wikiwaves. (Note that their Matlab code makes reference to k, the angular wavenumber, whereas this code uses \nu, which is equal to k / (2\pi).)
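The scheme implemented below can be read off by Fourier transforming the equation, writing 6uu_x = 3(u^2)_x and using \widehat{\partial_x f} = 2\pi i\nu \hat{f}:

```latex
% Fourier transform of  u_t + 6 u u_x + u_{xxx} = 0:
\hat{u}_t = i(2\pi\nu)^3 \hat{u} - 3i(2\pi\nu)\,\widehat{u^2}.
% Split step: the linear part is solved exactly over a step \Delta t,
\hat{u} \;\mapsto\; \hat{u}\, e^{i(2\pi\nu)^3 \Delta t},
% and the nonlinear part is advanced by a forward Euler step,
\hat{u} \;\mapsto\; \hat{u} - 3i(2\pi\nu)\,\Delta t\,\widehat{u^2}.
% These are the two updates of dft inside the While loop.
```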

nx = 128;
xmax = 1;
dx = 1/nx;
xs = Range[0, xmax - dx, dx];
numax = 1/(2*dx);
dnu = 1/xmax;
nus = Join[Range[0, numax - dnu, dnu], Range[-numax, -dnu, dnu]];

u0[x_] := Sin[6 Pi x];
data = u0 /@ xs;

solat[tmax_, data_] := 
 Module[{t = 0, dft = Fourier[data], dt = 0.4/nx^2, dft1 = 0, sol = 0},
  While[t < tmax,
   dt = Min[dt, tmax - t];
   dft1 = dft*Exp[I (2 Pi nus)^3 dt];
   dft = dft1 - 3 I 2 Pi nus dt Fourier[InverseFourier[dft1]^2];
   t += dt];
  sol = Re[InverseFourier[dft]];
  sol]

We declare the variables t, dft, dt, dft1 and sol as local to the module, and initialise them to sensible values. (Note that dft1 and sol have been initialised to 0; these values are completely arbitrary, because they will be overwritten a few lines later.)

The last line inside the module, sol, simply causes sol to be returned from the module.

A module for solving the sine-Gordon equation

We now consider the sine-Gordon equation

\phi_{tt} - \phi_{xx} + \sin\phi = 0.

Like KdV, the sine-Gordon equation is quasilinear, in the sense that no term contains any nonlinear function of a derivative. Moreover, in both equations there is only one nonlinear term and no inhomogeneities; the rest are linear and may be treated easily. From a computational perspective, the most important difference between the two is that the sine-Gordon equation is second-order in time. We first write it as a pair of coupled first-order equations:

\phi_t = \psi \text{ and } \psi_t = \phi_{xx} - \sin\phi.

This system can now be solved using a split-step pseudospectral method, much like the above, except that we must keep track of both dynamical variables \phi and \psi. The derivation of this is a bit messy, but follows the same principle of evolving the linear parts of the equations separately from the nonlinear part. Here is the Mathematica code:

nx = 128;
xmax = 1;
dx = 1/nx;
xs = Range[0, xmax - dx, dx];
numax = 1/(2*dx);
dnu = 1/xmax;
nus = Join[Range[0, numax - dnu, dnu], Range[-numax, -dnu, dnu]];

u0[x_] := Sin[6 Pi x];
data = u0 /@ xs;
sinegordonsol[tmax_, phi0_, psi0_] := Module[
  {t = 0, phift = Fourier[phi0], psift = Fourier[psi0],
   dt = 0.4/nx^2,
   phift1 = 0, psift1 = 0, sol = 0},
  While[t < tmax,
   dt = Min[dt, tmax - t];
   phift1 = phift Cos[dt] + psift Sin[dt];
   psift1 = psift Cos[dt] - phift Sin[dt];
   phift = phift1;
   psift = psift1 - Fourier[Sin[InverseFourier[phift]]] dt;
   t += dt];
  sol = Re[InverseFourier[phift]];
  sol]

To use it, do something like

sgs = sinegordonsol[1, data, 0*data];
ListLinePlot[Transpose[{xs, sgs}]]

The discrete Fourier transform in Mathematica

The Fourier transform is a very useful and widely used tool in many branches of maths. The continuous Fourier transform, available in Mathematica as FourierTransform, tries to perform the Fourier transform of a symbolic expression. This is useful for solving many problems if the Fourier transform can also be expressed in a nice symbolic form, but such a form is not necessarily available (and indeed the Fourier transform might not exist at all).

In practice, the discrete Fourier transform, which works with a finite set of data points and approximates the continuous Fourier transform, is more important. The DFT is available in Mathematica as Fourier, and the inverse DFT as InverseFourier. Note that Mathematica uses the 1/\sqrt{n} normalisation by default for both the DFT and the IDFT.
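Concretely, for a list u_1, \dots, u_n, Mathematica's Fourier computes (with the default FourierParameters -> {0, 1}):

```latex
% Mathematica's Fourier, default convention:
\hat{u}_s = \frac{1}{\sqrt{n}} \sum_{r=1}^{n} u_r\, e^{2\pi i (r-1)(s-1)/n},
% and InverseFourier is the same sum with e^{-2\pi i (r-1)(s-1)/n},
% again with the 1/\sqrt{n} prefactor.
```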


Here is an example usage of the DFT in Mathematica:

xmax = 1;
nx = 128;
dx = xmax / nx;
xs = Range[0, xmax - dx, dx];
f[x_] := Exp[Sin[6 Pi x]] + Sin[40 Pi x];
data = Map[f, xs];
ListLinePlot[Transpose[{xs, data}]]

dft = Fourier[data];
numax = 1/(2*dx);
dnu = 1/xmax;
nus = Join[Range[0, numax - dnu, dnu], Range[-numax, -dnu, dnu]];
ListLinePlot[Transpose[{nus, Abs[dft]}],
 PlotRange -> All]

The first block generates a list data which consists of sampled values of the function f at the values xs, which are 128 values between 0 and 1 (not including the endpoint at 1). These sampled points are plotted.

The second block calculates the discrete Fourier transform of data (according to the default normalisation). Then the list nus is constructed: these are the wavenumbers that correspond to each element of dft. (Note that \nu is not the angular wavenumber: the wavenumber of \sin(40\pi x) is said to be 20, while the angular wavenumber is 40\pi. The symbol k is often used for the angular wavenumber; I am using \nu for the wavenumber.) We then plot the absolute values of dft against the wavenumbers nus.


Note how the values of \nu_{\max} and \Delta\nu are related to the values of \Delta x and x_{\max}:

\nu_{\max} = \frac{1}{2\Delta x} \text{ and } \Delta\nu = \frac{1}{x_{\max}}.

What this means is the following:

  • If you sample more densely (so \Delta x is smaller), then you can pick up higher frequencies.
  • If you sample for a longer domain (so x_{\max} is larger), then you can pick up lower frequencies.

The highest frequency that you can pick up, 1/(2\Delta x), is called the Nyquist frequency and is equal to half the sampling rate of the data. If your function in fact oscillates at a higher frequency than this, then your sampled data will not pick up that component properly, and many of your results will be inaccurate. This effect is known as aliasing.


The Fourier transform can be used to solve many different problems, and is particularly useful for solving linear partial differential equations. Numerical PDE solvers that make use of the DFT are part of a class of numerical schemes known as spectral methods. For linear problems where the solution is oscillatory or where a ‘conservation law’ (usually analogous to conservation of mass or momentum) applies, spectral methods tend to be better than finite differences. The DFT can also be useful for quasilinear PDEs, using a family of methods known as pseudospectral methods.