The idea that theories might be summarised by neat variational principles dates back to antiquity. Such theories are aesthetically pleasing in their simplicity, and in line with the principle of parsimony (or Occam’s razor).

However, there is a major difference between the above examples, and the principle of least action. In the above problems, the independent variables are *spatial*, and are concerned with a steady state. The principle of least action, which concerns the evolution of particle motions with respect to *time*, appears to require knowledge about the future. This is metaphysically troubling even today.

In the early 1600s, a number of scientists, including Willebrord Snellius in 1621, independently discovered an empirical relationship between the angles of incidence and refraction when a beam of light passes through a boundary between different materials, which we now know as Snell’s law. In a 1662 letter, Pierre de Fermat showed that, under certain assumptions about the speed of light in different media, Snell’s law implies that the path taken by a ray between two given points is that of minimal travel time, and conversely, that a ray taking a path of minimal travel time obeys Snell’s law at the interface. Fermat’s argument, however, assumes that light travels slower in more dense media. We now know this to be true, but actual experimental evidence that light *in vacuo* travels at a finite speed was not available until 1676.

Fermat’s principle of minimal time was criticised by the prevalent Cartesian school on two grounds. Firstly, the above assumption about the speed of light was unjustified, and not compatible with René Descartes’ notion that the speed of light *in vacuo* is infinite, and higher in dense media. (These are not necessarily contradictory statements: the mathematical machinery for comparing infinite or infinitesimal quantities was concurrently being developed, although Newton’s *Principia* was not yet published and the calculus would not be formalised for another century or two.) A more fundamental criticism of Fermat’s principle was that it is *teleological*: why does light ‘choose’ to take a time-minimising path, and ‘know’ how to find such a path in advance? Why should it ‘choose’ to minimise travel time and not some other quantity such as distance (which would give a straight line)? Claude Clerselier, a Cartesian critic of Fermat, wrote in reply:

… The principle which you take as the basis for your proof, namely that Nature always acts by using the simplest and shortest paths, is merely a moral, and not a physical one. It is not, and cannot be, the cause of any effect in Nature.

In other words, although Fermat’s principle was mathematically equivalent to Snell’s law, and supported by experiment, it was not considered a satisfactory description of a physical basis behind Snell’s law, as no physical mechanism had been offered.

Newton’s *Principia* was published in 1687. After some initial controversy of their own, Newton’s ideas had become accepted by the time of Maupertuis and Euler. Newton’s formulation of particle mechanics, including the law of motion *F = ma* and the inverse square law for gravitation, gives a mathematical foundation for Kepler’s (empirical) laws of planetary motion.

An important development came in the 1740s with the development of the principle of least action by Pierre Louis Maupertuis and Leonhard Euler. Maupertuis defined action *S* as an ‘amount of motion’: for a single particle, action is momentum *mv* multiplied by the distance *s* travelled; for constant speed, *s = vt*, so the action is *S = mv*^{2}*t*. In the absence of a potential, this matches our modern definition of action, up to a factor of 2. (Maupertuis referred to the quantity *mv*^{2} as the *vis viva*, or ‘living force’, of the particle.) Studying the velocities of two colliding bodies before and after collision, Maupertuis showed that the law of conservation of momentum (by now well-established) is equivalent to the statement that the final velocities are such that the action of this process is minimised.

Euler is generally credited with inventing the calculus of variations in an early form, applying it to studying particle trajectories. (The modern form was later developed by Lagrange, his student, in 1755.) Euler generalised Maupertuis’ definition of action into the modern action integral, and included a new term for potential energy. He showed in 1744 that a particle subject to a central force (such as planetary motion) takes a path (calculated by Newton) that extremises this action, and *vice-versa*. Lagrange later showed more generally that the principle of least action is mathematically equivalent to Newton’s laws.
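In modern notation (a sketch using today’s conventions, not Euler’s own), the action for a single particle is the time integral of the Lagrangian, and requiring the action to be stationary yields the Euler–Lagrange equation, which reduces to Newton’s second law:

```latex
S[q] = \int_{t_1}^{t_2} L(q, \dot{q})\,\mathrm{d}t,
\qquad L(q, \dot{q}) = \tfrac{1}{2} m \dot{q}^2 - V(q);
\qquad
\delta S = 0
\;\Longleftrightarrow\;
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{q}}
  - \frac{\partial L}{\partial q} = 0
\;\Longleftrightarrow\;
m\ddot{q} = -V'(q).
```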

But why is this a sensible definition of action? In fact, *what is action*?

Maupertuis’ reasoning was that ‘Nature is thrifty in all its actions’, positing that action is a sort of ‘effort’. He was happy to attribute the principle of least action to a God seeking to minimise the effort of motions in the universe. But how does one know to choose this definition of action and not some other? As with refraction, why minimise travel time and not distance? Maupertuis argued that one *cannot* know to begin with, but that the correct functional needs to be identified.

Fermat and Euler took a rather weaker view, refusing to make any metaphysical interpretation of their variational principles. Fermat stated that his principle is ‘a mathematical regularity from which the empirically correct law can be derived’ (Sklar 2012): this is an *aesthetic* statement about the theory, but says nothing about its origins.

Everyone agrees that the principle of least action is mathematically equivalent to Newton’s laws of motion, and both have equivalent status when compared against experiments. However, Newton’s laws are specified as differential equations with initial values (‘start in this state, and forward-march in time, with no memory about your past and no information about your future’). In contrast, the principle of least action is formulated as a boundary value problem (‘get from A to B in time *T*, accumulating as little action as possible’), governed by the Euler–Lagrange equations. Why are we less comfortable with the latter?

One reason is the question: Given that we are at the initial position A, how can we know that we will be at B after time *T*? This can be resolved by realising that when we solve the Euler–Lagrange equations, we have not been told what the initial velocity is, and have the freedom to choose it such that the final position will be B. Thus, one can convert between an IVP and a BVP: this is the approach taken with the shooting method for solving BVPs numerically.
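A minimal sketch of the shooting method (a hypothetical example with made-up numbers, not from the original post): treat the unknown initial velocity as a free parameter, integrate the IVP forward, and bisect on that parameter until the trajectory lands on the required endpoint.

```python
# Shooting method sketch: a hypothetical particle obeying x'' = a
# (constant acceleration), required to travel from x(0) = 0 to x(T) = target.
# All numbers here are made up for illustration.

def final_position(v0, a=-9.81, T=2.0, dt=1e-4):
    """Forward-Euler integration of the IVP x'' = a, x(0) = 0, x'(0) = v0."""
    x, v = 0.0, v0
    for _ in range(int(T / dt)):
        x += v * dt
        v += a * dt
    return x

def shoot(target, lo=-100.0, hi=100.0, tol=1e-6):
    """Bisect on the unknown initial velocity until x(T) hits the target.

    Bisection works because final_position increases monotonically in v0."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if final_position(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v0 = shoot(target=10.0)  # the initial velocity that the BVP leaves us free to choose
```

Each bisection step solves a fresh IVP; the BVP’s missing initial condition is exactly the freedom being searched over.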

Another reason is perhaps cultural: most of us are taught Newtonian physics before Lagrangian physics. This is pedagogically reasonable: the Newtonian formulation requires far less mathematical machinery. There is also a technical reason for feeling more comfortable describing physics with an IVP rather than a BVP: according to the Picard–Lindelöf theorem, an IVP is guaranteed to have a unique solution, at least on a finite domain; no such guarantee can be made for a BVP.

The above essay has been guided by Lawrence Sklar’s book, *Philosophy and the Foundations of Dynamics*.

and I don’t care about second-order terms. The parentheses are there to indicate that these are term labels, not powers. But actually, there’s no need to have them, because if I ever need to raise something to the zeroth power, I can just write 1; and if I need to raise something to the first power, I don’t need to write the power at all. So, there’s no confusion at all by writing instead of ! If I need to square it, I can write . If I need to square , then I can write ; it’s unlikely I’ll need to take anything to the 12th power.

It’s an awful idea and a sane reviewer would reject it, but it does save time when LaTeXing…


*(Note: In this post, the terms ‘female’ and ‘male’ shall be shorthands for ‘people with XX chromosomes’ and ‘people with XY chromosomes’ respectively, since colourblindness is usually a chromosomal condition. I shall use the pronouns ‘she’ and ‘he’ as corresponding shorthands.*

*Chromosomal sex is distinct from other notions of sex or gender. Unfortunately, this usage may be misleading for transsexual and intersex people, as well as people with some chromosomal abnormalities. This problem is further discussed by Randall Munroe, author of the xkcd comics, here. As he says, ‘The role of gender in society is the most complicated thing I’ve ever spent a lot of time learning about, and I’ve spent a lot of time learning about quantum mechanics.’)*

Colourblindness is usually hereditary, caused by defects in the X chromosome that inhibit the proper formation of colour sensors in the eye. (There are also other rarer genetic defects which can cause colourblindness, as well as environmental factors, but we shan’t discuss them here.) Males are more likely than females to be colourblind, as they have only one copy of the X chromosome: the male will be colourblind if this copy is defective (we say that he has genotype xY, where the lower-case x denotes a defective allele, which is recessive). In contrast, a female has two copies of the X chromosome and will be colourblind only if both are defective (genotype xx). Epidemiologists have shown that, across the population, *p* = 0.08 = 8% of males, and *p*^{2} = 0.0064 = 0.64% of females, are colourblind. This means that if a male is sampled uniformly randomly from the male population, then the probability that he is colourblind is 8%.

The above is not the same as saying ‘The probability that a particular male is colourblind is 8%.’ Suppose Thomas is a male who we know to be colourblind: then the probability that Thomas is colourblind is 100%! This demonstrates the need to distinguish between *prior probability* and *posterior probability*. Suppose Joanna is a female, sampled uniformly randomly from the female population. Without any further information about Joanna, we would say that the probability that Joanna is colourblind is 0.64%. This is the *prior probability*.

Suppose, however, we know that Joanna’s mother is colourblind, while her father is not. What is the probability that Joanna is colourblind? Joanna’s mother has genotype xx, which means that she must pass on a defective allele. However, Joanna’s father is not colourblind, so we know that he must have genotype XY. We know that Joanna is female, so her father must be passing on his X chromosome. Joanna’s genotype *must* therefore be Xx, and she has a 0% probability of being colourblind! This is the *posterior probability*, which takes into account the information that we are given about Joanna’s parents.

The same is true of any female child of these parents. Similarly, if they have a male child, he must have genotype xY, and has a 100% probability of being colourblind! If they have a child of unspecified sex, then the probability that that child will be colourblind is 50%. This 50% is the probability *posterior* to us being told about the genotypes of the parents, but *prior* to us being told the sex of the child.

Now let’s return to the question at the top of this post. Two parents, Alice and Bob, neither of whom are colourblind, have two sons, both of whom are colourblind. ‘What are the odds?’ It depends on when you ask this question.

Before they have any children, all the information we know about Alice and Bob are their phenotypes, that is, the fact that they are not colourblind. This means that Bob’s genotype must be XY (if it were xY, then he would be colourblind), while Alice’s genotype may be XX, Xx or xX (it cannot be xx). What are the probabilities of each of these possibilities? *Since Alice’s genotype cannot be xx*, the probability of XX is (1 − *p*)^{2}/(1 − *p*^{2}) = (1 − *p*)/(1 + *p*) ≈ 0.852, while the probability of Xx or xX, *i.e.* that she carries a defective allele, is 2*p*(1 − *p*)/(1 − *p*^{2}) = 2*p*/(1 + *p*) ≈ 0.148.
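These conditional probabilities can be checked in a few lines of Python (a sketch, assuming each of a female’s two X alleles is independently defective with frequency *p* = 0.08, consistent with the 0.64% figure above):

```python
# Posterior genotype probabilities for Alice, a non-colourblind female.
p = 0.08

p_XX = (1 - p) ** 2          # both alleles normal
p_carrier = 2 * p * (1 - p)  # exactly one defective allele (Xx or xX)
p_xx = p ** 2                # both alleles defective: colourblind

# Condition on Alice not being colourblind, i.e. genotype is not xx:
p_XX_given_sighted = p_XX / (1 - p_xx)            # approx. 0.852
p_carrier_given_sighted = p_carrier / (1 - p_xx)  # approx. 0.148
```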

Suppose they have their first child. If the child is female, then Bob must have passed on his X chromosome; the daughter must have a non-defective X chromosome, and has a probability 0 of being colourblind. If the child is male, then Bob must have passed on his Y chromosome, while Alice may have passed on either of her X chromosomes. The child will be colourblind if and only if Alice passes on a defective X chromosome. This happens with probability 0 if Alice’s genotype is XX, but with probability 0.5 if Alice carries a defective allele. The probability that a son will be colourblind is therefore 0.5 × 2*p*/(1 + *p*) = *p*/(1 + *p*) ≈ 0.074. If the sex of the child is not known, the probability that the child is colourblind is half this, ≈ 0.037. This is the *prior probability* that the child will be colourblind.

Alice and Bob have their first child, Charlie, a son who is tested and found to be colourblind. They then have a second son, David. What is the probability that David is colourblind? After Charlie’s diagnosis, we now know that Alice must carry a defective allele: her genotype must be either Xx or xX. She has a 50% chance of passing on this defective allele to David. *Given that his older brother is colourblind*, the probability that David is colourblind is as high as 50%! But if the parents have both Charlie and David before testing them at the same time, then the probability that Charlie and David are both diagnosed as colourblind is roughly 1 in 1000, which is much lower!
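The family history can also be checked by Monte Carlo simulation (a sketch under the same assumptions as before: allele frequency 0.08, alleles inherited independently). The simulation should give a prior of about 7% for the first son being colourblind, but a conditional probability of about 50% for the second son given the first:

```python
import random

P_DEFECT = 0.08  # population frequency of the defective allele x

def sample_family():
    """Rejection-sample a non-colourblind couple, then two sons.

    Returns (son1_colourblind, son2_colourblind); True marks colourblind."""
    while True:
        mum = (random.random() < P_DEFECT, random.random() < P_DEFECT)
        dad_defective = random.random() < P_DEFECT
        if dad_defective or all(mum):
            continue  # a colourblind parent: resample
        # Each son gets Y from dad, so only the randomly chosen maternal X matters.
        return random.choice(mum), random.choice(mum)

random.seed(1)
families = [sample_family() for _ in range(200_000)]
n_first = sum(s1 for s1, _ in families)
p_first = n_first / len(families)                    # prior: about 0.074
p_second_given_first = (
    sum(s1 and s2 for s1, s2 in families) / n_first  # posterior: about 0.5
)
```

The jump from 7% to 50% is the Bayesian update in action: observing Charlie tells us Alice is certainly a carrier.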

The above demonstrates Bayesian probability, an interpretation of the concept of probability. The above probabilities are ‘degrees of belief’, or ‘how likely we think something is, based on the information that we have’. We start with *prior probabilities* based on already-established data (such as the findings of the epidemiologists); when more information is given about a situation, we use this information to update our beliefs, obtaining *posterior probabilities* (which are often phrased using ‘given that’). The mathematical statement of how to relate prior and posterior probabilities is Bayes’ theorem.
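For reference, Bayes’ theorem relates the two: for a hypothesis H (say, ‘Alice carries a defective allele’) and evidence E (say, ‘her first son is colourblind’),

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
```

where P(H) is the prior and P(H | E) the posterior.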

Although the above calculations are all done mathematically, the probabilities can nonetheless be interpreted as being subjective: they reflect how certain we are of something. Indeed, they might represent ‘What payout odds would I accept if I’m making a bet on this?’.

*(Addendum: I’d forgotten that I wrote another piece on probability and eyes!)*

- Prince (2008) reports that pigs are approximately blue.
- Quail (2006), Quaffer (2008) and Qi (2009) use the approximation that pigs are blue.
- Rout (2012) is a review article discussing the aforementioned works.

Which of the following are valid?

- ‘Pigs are approximately blue (Prince 2008).’
- ‘Pigs are approximately blue (Quail 2006, Quaffer 2008, Qi 2009).’
- ‘We use the approximation that pigs are blue (Prince 2008).’
- ‘We use the approximation that pigs are blue (Quail 2006, Quaffer 2008, Qi 2009).’
- ‘We use the widely-used approximation that pigs are blue (Quail 2006, Quaffer 2008, Qi 2009).’
- ‘We use the widely-used approximation that pigs are blue (Rout 2012).’
- ‘The approximation that pigs are blue is widely used (Quail 2006, Quaffer 2008, Qi 2009).’
- ‘The approximation that pigs are blue is widely used (Rout 2012).’
- ‘Many authors, including Quail (2006), Quaffer (2008) and Qi (2009), use the approximation that pigs are blue.’
- ‘Many authors, including Quail (2006), Quaffer (2008) and Qi (2009), use the approximation that pigs are blue (Rout 2012).’

She was not okay. She was not waiting for a lift. She had been walking around Cambridge, lost, for the last two and a half hours. She was from a different city and had been on a hen night; she had split off from the rest of the group and was trying to find her way back to the Travelodge. Her phone battery had run out. She was dressed up for going out; her shoes were not suitable for that much walking, and the cold weather was becoming increasingly unpleasant. (As it happened, the Travelodge was just down the road, about two minutes’ walk away.)

She was grateful, and told me that she had tried to stop and ask multiple people for directions, eliciting only ‘I don’t know, sorry’, rudeness, or abuse. Most horrifyingly, she had approached police officers for help and directions, but the police refused to help her, on the grounds that she was not a victim of a crime and therefore not in trouble.

Is their job to uphold the Law and to stop crime, or to serve the public and keep them safe?

Then, I realised: Assuming that Egypt had existed for centuries before the time of Joseph, then successive Pharaohs might have appointed lots of people to be Viziers in this way, based solely on their abilities to make predictions based on individual dreams. Those who turned out to be wrong, or who were unable to enact the appropriate policies, were *disposed of* and their stories were not recorded and have not been passed down to us.

These are admittedly only a couple of examples, so I may be going a bit far, but nonetheless, I claim that the following proverbs are true:

*C0: For any proverb P, ‘P is a Chinese proverb’ is a proverb.*

*C1: For any proverb P, P is not a Chinese proverb if and only if P is claimed to be a Chinese proverb.*

Since it is unnecessary for the Chinese to claim that a statement is a *Chinese* proverb (we need merely claim it to be a *proverb*), I make also the following claim:

*C2: For any proverb P, ‘P is a Chinese proverb’ is not a Chinese proverb.*

Can these claims be consistent, and which (if any) can I consistently claim to be Chinese proverbs?

**Addendum:** Oftentimes, the claim that a proverb is Chinese is used by orientalist woo-peddlers to create credence for their claims. Allow me therefore to go so far as to claim:

*C3: For any proverb P, if P is claimed to be a Chinese proverb then P is false.*

Is this consistent?

One thing that’s not immediately obvious is how you refer to things. On Twitter, you can refer to people as `@jftsang` and to groups as `#example`. On IRC networks, channels are usually called `#channel` or `##unofficialchannel`.

Well, on Matrix, user IDs take the form `@username:server`, such as `@jftsang:matrix.org`. The latter part tells you about the *homeserver* of the user, which is needed because Matrix is a distributed network and different users might be accessing through different servers. Rooms take the form `#room:server`, and *communities* take the form `+room:server`. I’m not yet sure what the relationship between rooms and communities is.

(*) Ten years of relying almost exclusively on Facebook means that we tend not to have many of our friends’ email addresses. The situation was particularly bad when Facebook tried pushing their @facebook.com email addresses, which fortunately didn’t catch on.

I would recommend that anyone interested in a free-as-in-speech-*and*-as-in-beer IM service try this out; send me a message at @jftsang:matrix.org on Matrix, or give me your email address so that I may invite you.

In my work I look at cups full of hundreds of hundreds of hundreds of hundreds of tiny little bits, such as little glass balls, or the hard white stuff that you can put on your food. When you try to move these bodies made up of little bits, such as when you move them from one cup to another, they act a little bit like water but not quite in the same way. We understand how water moves from one cup to another quite well, but it’s much harder when you have these little bits because they can do a lot of different things that water can’t do. It’s also harder to learn about how these little bits move because they are usually not see-through, so that you can’t look inside the body and see how it’s moving under the top.

I want to understand how these bodies made up of tiny little bits move when you make them go over things. It’s hard to try it out in real life because the bits are not see-through, so instead I use a computer where I can play with pretend tiny little bits. This lets me look at how each little bit moves. It’s also much cleaner to do it on the computer and I can try many different set-ups at the same time, but you have to make sure that the pretend little bits are like real ones.

It’s important to study how stuff made up of tiny little bits go over things, because sometimes too much of this stuff can move at the same time, very quickly, over houses. This happens quite often and a lot of people get hurt. It’s also interesting because people also use a lot of this stuff to build things and we need to know what’s the best way of moving it.
