Historical and philosophical contexts of the calculus of variations

The calculus of variations is concerned with finding functions that extremise (maximise or minimise) a particular quantity. A classic example is the catenary problem. What shape does a chain take when hung between two points? It is the unique shape that minimises the potential energy of the chain; and such a shape is called a catenary, and is given by the cosh function. The idea that the potential energy of a hanging chain should be minimised is a variational principle. Another example of a variational principle is the notion that a soap bubble or water balloon should have a shape that has a minimal surface area, namely a sphere. The variational principles in both examples predict the same shapes as those that one would find by constructing force-balance arguments on line or surface elements, but the variational formulations are far simpler to describe and implement.

The idea that theories might be summarised by neat variational principles has been around since antiquity. Such theories are aesthetically pleasing in their simplicity, and in line with the principle of parsimony (or Occam’s razor).

However, there is a major difference between the above examples and the principle of least action. In the above problems, the independent variables are spatial, and the problems are concerned with a steady state. The principle of least action, which concerns the evolution of particle motions with respect to time, appears to require knowledge about the future. This is metaphysically troubling even today.

Optics and Fermat’s principle

In the early 1600s, a number of scientists, including Willebrord Snellius in 1621, independently discovered an empirical relationship between the angles of incidence and refraction when a beam of light passes through a boundary between different materials, which we now know as Snell’s law. In a 1662 letter, Pierre de Fermat showed that, under certain assumptions about the speed of light in different media, Snell’s law implies that the path taken by a ray between two given points is that of minimal travel time, and conversely, a ray that takes a path of minimal travel time obeys Snell’s law at the interface. Fermat’s argument, however, assumes that light travels slower in more dense media. We now know this to be true, but actual experimental evidence that light in vacuo travels at a finite speed was not available until 1676.
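The equivalence described above can be checked numerically. The following is only a sketch under stated assumptions: the geometry (a source at height a above a flat interface, a target at depth b below it, horizontal separation d) and the speeds v1 and v2 are made-up illustrative values, and the function names are hypothetical. It minimises the travel time directly, without using Snell’s law, and then compares sin(θ1)/v1 with sin(θ2)/v2.

```python
import math

def travel_time(x, a, b, d, v1, v2):
    """Time for a ray from (0, a) to the interface point (x, 0), then on to (d, -b)."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2

def fastest_crossing(a, b, d, v1, v2, iterations=200):
    """Ternary search for the crossing point x in [0, d] that minimises travel time.

    The travel time is a convex function of x, so ternary search is valid here.
    """
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, b, d, v1, v2) < travel_time(m2, a, b, d, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def snell_residual(a, b, d, v1, v2):
    """Return sin(theta1)/v1 - sin(theta2)/v2 at the time-minimising crossing point."""
    x = fastest_crossing(a, b, d, v1, v2)
    sin1 = x / math.hypot(x, a)            # sine of the angle of incidence
    sin2 = (d - x) / math.hypot(d - x, b)  # sine of the angle of refraction
    return sin1 / v1 - sin2 / v2

# Illustrative values only: light is slower (v2 < v1) in the lower medium.
print(snell_residual(a=1.0, b=1.0, d=2.0, v1=1.0, v2=0.75))
```

The printed residual should be zero up to the accuracy of the search, which is Snell’s law written in terms of speeds rather than refractive indices.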

Fermat’s principle of minimal time was criticised by the prevalent Cartesian school on two grounds. Firstly, the above assumption about the speed of light was unjustified, and not compatible with René Descartes’ notions that the speed of light in vacuo is infinite, and higher in dense media. (These are not necessarily contradictory statements: the mathematical machinery for comparing infinite or infinitesimal quantities was concurrently being developed, although Newton’s Principia was not yet published and the calculus would not be formalised for another century or two.) A more fundamental criticism of Fermat’s principle was that it is teleological: why does light ‘choose’ to take a time-minimising path, and ‘know’ how to find such a path in advance? Why should it ‘choose’ to minimise travel time and not some other quantity such as distance (which would give a straight line)? Claude Clerselier, a Cartesian critic of Fermat, wrote in reply:

… The principle which you take as the basis for your proof, namely that Nature always acts by using the simplest and shortest paths, is merely a moral, and not a physical one. It is not, and cannot be, the cause of any effect in Nature.

In other words, although Fermat’s principle was mathematically equivalent to Snell’s law, and supported by experiment, it was not considered a satisfactory description of a physical basis behind Snell’s law, as no physical mechanism had been offered.

Particle mechanics and the principle of least action

Newton’s Principia was published in 1687. After some initial controversy of their own, Newton’s ideas had become accepted by the time of Maupertuis and Euler. Newton’s formulation of particle mechanics, including the law of motion F = ma and the inverse square law for gravitation, gives a mathematical foundation for Kepler’s (empirical) laws of planetary motion.

An important development came in the 1740s with the development of the principle of least action by Pierre Louis Maupertuis and Leonhard Euler. Maupertuis defined action S as an ‘amount of motion’: for a single particle, action is momentum mv multiplied by the distance s travelled; for constant speed, s = vt, so the action is S = mv^2 t. In the absence of a potential, this matches our modern definition of action, up to a factor of 2. (Maupertuis referred to the quantity mv^2 as the vis viva, or ‘living force’, of the particle.) Studying the velocities of two colliding bodies before and after collision, Maupertuis showed that the law of conservation of momentum (by now well-established) is equivalent to the statement that the final velocities are such that the action of this process is minimised.
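The factor of 2 can be seen directly for a free particle moving at constant speed v for a time t, taking the modern action (with no potential) to be the time integral of the kinetic energy. The subscripts below are labels introduced here for illustration only:

```latex
S_{\text{Maupertuis}} = (mv)(vt) = m v^2 t,
\qquad
S_{\text{modern}} = \int_0^t \tfrac{1}{2} m v^2 \,\mathrm{d}t'
                  = \tfrac{1}{2} m v^2 t
                  = \tfrac{1}{2} S_{\text{Maupertuis}} .
```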

Euler is generally credited with inventing the calculus of variations in an early form, applying it to studying particle trajectories. (The modern form was later developed by Lagrange, his student, in 1755.) Euler generalised Maupertuis’ definition of action into the modern action integral, and included a new term for potential energy. He showed in 1744 that a particle subject to a central force (such as planetary motion) takes a path (calculated by Newton) that extremises this action, and vice-versa. Lagrange later showed more generally that the principle of least action is mathematically equivalent to Newton’s laws.

But why is this a sensible definition of action? In fact, what is action?

Maupertuis’ reasoning was that ‘Nature is thrifty in all its actions’, positing that action is a sort of ‘effort’. He was happy to attribute the principle of least action to some sort of God trying to minimise the effort of motions in the universe. But how does one know to choose this definition of action and not some other? As for refraction, why does one minimise travel time and not distance? Maupertuis argued that one cannot know to begin with, but that the correct functional needs to be identified.

Fermat and Euler took a rather weaker view, and refused to make any metaphysical interpretations about their variational principles. Fermat stated that his principle is ‘a mathematical regularity from which the empirically correct law can be derived’ (Sklar 2012): this is an aesthetic statement about the theory, but says nothing about its origins.

Why do we find the principle of least action problematic?

Everyone agrees that the principle of least action is mathematically equivalent to Newton’s laws of motion, and both have equivalent status when compared against experiments. However, Newton’s laws are specified as differential equations with initial values (‘start in this state, and forward-march in time, with no memory about your past and no information about your future’). In contrast, the principle of least action is formulated as a boundary value problem (‘get from A to B in time T, accumulating as little action as possible’), governed by the Euler–Lagrange equations. Why are we less comfortable with the latter?

One reason is the question: given that we are at the initial position A, how can we know that we will be at B after time T? This can be resolved by realising that when we solve the Euler–Lagrange equations, we have not been told what the initial velocity is, and have the freedom to choose it such that the final position will be B. Thus, one can convert between an initial value problem (IVP) and a boundary value problem (BVP): this is the approach taken by the shooting method for solving BVPs numerically.
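The shooting method mentioned above can be sketched in a few lines. This is a minimal illustration, not a robust solver. The test problem (a pendulum, x'' = -sin x, starting at x = 0 and required to reach a given position at time T = 1), the function names, and the starting guesses for the initial velocity are all assumptions chosen for the example. Each trial velocity is turned into an IVP, integrated forwards with a standard fourth-order Runge–Kutta scheme, and the velocity is then adjusted by secant iteration until the end point is hit.

```python
import math

def final_position(v0, x0=0.0, T=1.0, steps=1000):
    """Solve the IVP x'' = -sin(x), x(0) = x0, x'(0) = v0 with RK4; return x(T)."""
    def deriv(x, v):
        return v, -math.sin(x)

    h = T / steps
    x, v = x0, v0
    for _ in range(steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = deriv(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = deriv(x + h * k3x, v + h * k3v)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x

def shoot(target, v_a=0.0, v_b=1.0, tol=1e-10, max_iter=50):
    """Secant iteration on the initial velocity so that x(T) equals target."""
    f_a = final_position(v_a) - target
    f_b = final_position(v_b) - target
    for _ in range(max_iter):
        if abs(f_b) < tol:
            return v_b
        # Secant update: the right-hand side uses the old values of v_a, v_b, f_a, f_b.
        v_a, v_b, f_a = v_b, v_b - f_b * (v_b - v_a) / (f_b - f_a), f_b
        f_b = final_position(v_b) - target
    raise RuntimeError("shooting did not converge")

v = shoot(target=0.5)
print(v, final_position(v))
```

The particle never ‘knows’ where it will end up: the end condition is only used by the outer iteration to select which initial velocity to start with.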

Another reason is perhaps cultural: most of us are taught Newtonian physics before Lagrangian physics. This is pedagogically reasonable: the Newtonian formulation requires far less mathematical machinery. There is also a technical reason for feeling more comfortable with describing physics through an IVP than a BVP: according to the Picard–Lindelöf theorem, an IVP with a sufficiently well-behaved (Lipschitz continuous) right-hand side is guaranteed to have a unique solution, at least on a finite interval; a similar guarantee cannot be made for a BVP. For example, y'' = −y with y(0) = y(π) = 0 is satisfied by y = C sin x for every constant C.


The above essay has been guided by Lawrence Sklar’s book, Philosophy and the Foundations of Dynamics.

Type inference for lazy LaTeXing

I am doing some work with asymptotic expansions of the form

 h = h^{(0)} + \epsilon h^{(1)} + O(\epsilon^2)

and I don’t care about second-order terms. The parentheses are there to indicate that these are term labels, not powers. But actually, there’s no need to have them, because if I ever need to raise something to the zeroth power, I can just write 1; and if I need to raise something to the first power, I don’t need to write the power at all. So, there’s no confusion at all by writing h^0 instead of h^{(0)} ! If I need to square it, I can write h^{02}. If I need to square h^{(1)}, then I can write h^{12}; it’s unlikely I’ll need to take anything to the 12th power.

It’s an awful idea and a sane reviewer would reject it, but it does save time when LaTeXing…

Colourblindness and probability

A female acquaintance of mine was recently surprised to find that both of her sons were colourblind, despite neither parent being colourblind. A natural question to ask is ‘What are the odds?’ This question turns out to be open to interpretation, depending on what we mean by probability and odds.


Primary, secondary and tertiary sources

I am a bit annoyed that scientists don’t always seem to get the difference between primary, secondary and tertiary sources. Consider this situation:

  • Prince (2008) reports that pigs are approximately blue.
  • Quail (2006), Quaffer (2008) and Qi (2009) use the approximation that pigs are blue.
  • Rout (2012) is a review article discussing the aforementioned works.

Which of the following are valid?

  1. ‘Pigs are approximately blue (Prince 2008).’
  2. ‘Pigs are approximately blue (Quail 2006, Quaffer 2008, Qi 2009).’
  3. ‘We use the approximation that pigs are blue (Prince 2008).’
  4. ‘We use the approximation that pigs are blue (Quail 2006, Quaffer 2008, Qi 2009).’
  5. ‘We use the widely-used approximation that pigs are blue (Quail 2006, Quaffer 2008, Qi 2009).’
  6. ‘We use the widely-used approximation that pigs are blue (Rout 2012).’
  7. ‘The approximation that pigs are blue is widely used (Quail 2006, Quaffer 2008, Qi 2009).’
  8. ‘The approximation that pigs are blue is widely used (Rout 2012).’
  9. ‘Many authors, including Quail (2006), Quaffer (2008) and Qi (2009), use the approximation that pigs are blue.’
  10. ‘Many authors, including Quail (2006), Quaffer (2008) and Qi (2009), use the approximation that pigs are blue (Rout 2012).’

The police as the modern priest-Levite

Last weekend, I was cycling home at around 1am and was going through the East Road–Newmarket Road–Elizabeth Way roundabout, when I went past a woman who looked like she was waiting for a lift. However, the side of a major road away from any houses seemed like an odd place to wait for a lift, and she seemed distressed, so I stopped and asked if she was okay.

She was not okay. She was not waiting for a lift. She had been walking around Cambridge, lost, for the last two and a half hours. She was from a different city and had been on a hen night; she had split off from the rest of the group and was trying to find her way back to the Travelodge. Her phone battery had run out. She was dressed up for going out; her shoes were not suitable for that much walking, and the cold weather was becoming increasingly unpleasant. (As it happened, the Travelodge was just down the road, about two minutes’ walk away.)

She was grateful, and told me that she had tried to stop and ask multiple people for directions, eliciting only ‘I don’t know, sorry’, rudeness, or abuse. Most horrifyingly, she had approached police officers for help and directions, but the police refused to help her, on the grounds that she was not a victim of a crime and therefore not in trouble.

Is their job to uphold the Law and to stop crime, or to serve the public and keep them safe?

Reporting biases in Genesis and Andrew Lloyd Webber

It’s always bugged me how in Joseph and the Amazing Technicolor Dreamcoat Pharaoh hires Joseph rather arbitrarily to be his Vizier responsible for Egypt’s economic policies over the next fourteen years, based solely on Joseph’s (as yet unproven) ability to explain his dreams. Even if Joseph’s forecasting was accurate, as a lowly foreign-born slave-turned-prisoner would he have had the administrative skills to oversee such huge reforms?

Then, I realised: assuming that Egypt had existed for centuries before the time of Joseph, successive Pharaohs might have appointed lots of people to be Viziers in this way, based solely on their abilities to make predictions based on individual dreams. Those who turned out to be wrong, or who were unable to enact the appropriate policies, were disposed of and their stories were not recorded and have not been passed down to us.

Chinese proverbs

I’ve noticed an annoying and persistent tendency for people to inaccurately claim that certain sayings are Chinese proverbs. ‘Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime’ is one example of such. ‘A picture is worth a thousand words’ is another.

These are admittedly only a couple of examples, so I may be going a bit far, but nonetheless, I claim that the following proverbs are true:

  • C0: For any proverb P, ‘P is a Chinese proverb’ is a proverb.
  • C1: For any proverb P, P is not a Chinese proverb if and only if P is claimed to be a Chinese proverb.

Since it is unnecessary for the Chinese to claim that a statement is a Chinese proverb (we need merely claim it to be a proverb), I make also the following claim:

  • C2: For any proverb P, ‘P is a Chinese proverb’ is not a Chinese proverb.

Can these claims be consistent, and which (if any) can I consistently claim to be Chinese proverbs?

Addendum: Oftentimes, the claim that a proverb is Chinese is used by orientalist woo-peddlers to create credence for their claims. Allow me therefore to go so far as to claim:

  • C3: For any proverb P, if P is claimed to be a Chinese proverb then P is false.

Is this consistent?

Using Matrix instant messaging

Following my recent rant on decentralising our communications, I’ve started trying out the Matrix communication protocol on the suggestion of a friend. It’s a wonderful idea, and it’s great that the network can be connected to by various different clients. And it seems to be very easy to add people to the network: you just need to give their email address(*) to invite them. The Riot.im client is quite easy to use for basic purposes, although there are some nuances that I haven’t got used to yet.

One thing that’s not immediately obvious is how you refer to things. On Twitter, you can refer to people as @jftsang and to groups as #example. On IRC networks, channels are usually called #channel or ##unofficialchannel.

Well, on Matrix, user IDs take the form @username:server, such as @jftsang:matrix.org. The latter part tells you about the homeserver of the user, which is needed because Matrix is a distributed network and different users might be accessing through different servers. Rooms take the form #room:server, and communities take the form +community:server. I’m not yet sure what the relationship between rooms and communities is.
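As a rough illustration of the naming scheme just described, here is a small sketch that splits an identifier into its kind, local name and server. It is based only on the three forms mentioned above and is not the official Matrix grammar; the names MatrixId and parse_matrix_id are hypothetical, and no validation of the allowed characters is attempted.

```python
from typing import NamedTuple

# Leading character ("sigil") for each kind of identifier described above.
SIGILS = {"@": "user", "#": "room", "+": "community"}

class MatrixId(NamedTuple):
    kind: str       # "user", "room" or "community"
    localpart: str  # the name before the first colon
    server: str     # the homeserver after the first colon

def parse_matrix_id(identifier: str) -> MatrixId:
    """Split an identifier such as '@jftsang:matrix.org' into its parts."""
    if not identifier or identifier[0] not in SIGILS:
        raise ValueError(f"unrecognised sigil in {identifier!r}")
    # Split on the first colon only, so a server with a port stays intact.
    localpart, sep, server = identifier[1:].partition(":")
    if not sep or not localpart or not server:
        raise ValueError(f"expected the form <sigil>name:server, got {identifier!r}")
    return MatrixId(SIGILS[identifier[0]], localpart, server)

print(parse_matrix_id("@jftsang:matrix.org"))
```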

(*) Ten years of relying almost exclusively on Facebook means that we tend not to have many of our friends’ email addresses. The situation was particularly bad when Facebook tried pushing their @facebook.com email addresses, which fortunately didn’t catch on.

I would recommend that anyone interested in a free-as-in-speech-and-as-in-beer IM service try this out; send me a message at @jftsang:matrix.org, or give me your email address so that I may invite you.

Explaining my PhD using the ten hundred most used words

You can try this yourself at http://splasho.com/upgoer5/.

In my work I look at cups full of hundreds of hundreds of hundreds of hundreds of tiny little bits, such as little glass balls, or the hard white stuff that you can put on your food. When you try to move these bodies made up of little bits, such as when you move them from one cup to another, they act a little bit like water but not quite in the same way. We understand how water moves from one cup to another quite well, but it’s much harder when you have these little bits because they can do a lot of different things that water can’t do. It’s also harder to learn about how these little bits move because they are usually not see-through, so that you can’t look inside the body and see how it’s moving under the top.

I want to understand how these bodies made up of tiny little bits move when you make them go over things. It’s hard to try it out in real life because the bits are not see-through, so instead I use a computer where I can play with pretend tiny little bits. This lets me look at how each little bit moves. It’s also much cleaner to do it on the computer and I can try many different set-ups at the same time, but you have to make sure that the pretend little bits are like real ones.

It’s important to study how stuff made up of tiny little bits goes over things, because sometimes too much of this stuff can move at the same time, very quickly, over houses. This happens quite often and a lot of people get hurt. It’s also interesting because people also use a lot of this stuff to build things and we need to know what’s the best way of moving it.

The LaTeX psalm chant

LaTeX’s output, showing its hyphenation algorithms at work, makes me want to set my bibliography to plainchant:

[19] [20] [21] (./blasius.bbl
Underfull \hbox (badness 1210) in paragraph at lines 13--15
[]\OT1/cmr/m/sc/9 Andreotti, Bruno, Forterre, Yo[]el & Pouliquen, Oliver \OT1/c
mr/m/n/9 2013 \OT1/cmr/m/it/9 Gran-u-lar Me-dia\OT1/cmr/m/n/9 .
Underfull \hbox (badness 6396) in paragraph at lines 156--158
[]\OT1/cmr/m/sc/9 Peregrine, D. H. \OT1/cmr/m/n/9 1967 Long waves on a beach. \
OT1/cmr/m/it/9 Jour-nal of Fluid Me-chan-ics

Underfull \hbox (badness 5954) in paragraph at lines 180--184
[]\OT1/cmr/m/sc/9 Rajchenbach, Jean \OT1/cmr/m/n/9 2005 Rhe-ol-ogy of dense gra
n-u-lar ma-te-ri-als: steady, uni-form

Underfull \hbox (badness 10000) in paragraph at lines 180--184
\OT1/cmr/m/n/9 flow and the avalanche regime. \OT1/cmr/m/it/9 Jour-nal of Physi
cs: Con-densed Mat-ter
[23]) [24] (./blasius.aux)