Permutation-based career

My PhD thesis ended up being largely about discrete particle modelling.

I’ve recently joined a team of developers working on platforms and data management.

I’m looking forward to my future jobs working on disease management programmes, manic depressive psychosis, the Pakistan Meteorological Department, and the Mouse Phenome Database (which is so cool).

Historical and philosophical contexts of the calculus of variations

The calculus of variations is concerned with finding functions that extremise (maximise or minimise) a particular quantity. A classic example is the catenary problem: what shape does a chain take when hung between two points? It is the unique shape that minimises the potential energy of the chain; such a shape is called a catenary, and is described by the cosh function. The idea that the potential energy of a hanging chain should be minimised is a variational principle. Another example of a variational principle is the notion that a soap bubble or water balloon should take the shape of minimal surface area for the volume it encloses, namely a sphere. The variational principles in both examples predict the same shapes as those found by constructing force-balance arguments on line or surface elements, but the variational formulations are far simpler to describe and implement.
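
A sketch in modern notation (the symbols here are illustrative, not from any particular source): a chain of uniform line density $\rho$, hanging under gravity $g$ along a curve $y(x)$, has potential energy

$$E = \rho g \int y \, \mathrm{d}s = \rho g \int y \sqrt{1 + y'^2} \, \mathrm{d}x,$$

and minimising $E$ subject to the constraint that the length $\int \sqrt{1 + y'^2} \, \mathrm{d}x$ is fixed yields $y = a \cosh\!\left(\frac{x - x_0}{a}\right)$, with the constants $a$ and $x_0$ determined by the endpoints and the length.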

The idea that theories might be summarised by neat variational principles has been proposed since antiquity. Such theories are aesthetically pleasing in their simplicity, and in line with the principle of parsimony (or Occam’s razor).

However, there is a major difference between the above examples and the principle of least action. In the above problems, the independent variables are spatial, and the systems are in a steady state. The principle of least action, which concerns the evolution of particle motions with respect to time, appears to require knowledge about the future. This is metaphysically troubling even today.

Optics and Fermat’s principle

In the early 1600s, a number of scientists, including Willebrord Snellius in 1621, independently discovered an empirical relationship between the angles of incidence and refraction when a beam of light passes through a boundary between different materials, which we now know as Snell’s law. In a 1662 letter, Pierre de Fermat showed that, under certain assumptions about the speed of light in different media, Snell’s law implies that the path taken by a ray between two given points is that of minimal travel time, and conversely, a ray that takes a path of minimal travel time obeys Snell’s law at the interface. Fermat’s argument, however, assumes that light travels more slowly in denser media. We now know this to be true, but actual experimental evidence that light in vacuo travels at a finite speed was not available until 1676.
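
The derivation is short in modern notation (the coordinates here are illustrative): for a ray travelling from $(0, a)$ in medium 1 to $(d, -b)$ in medium 2, crossing the interface $y = 0$ at $(x, 0)$, the travel time is

$$t(x) = \frac{\sqrt{a^2 + x^2}}{v_1} + \frac{\sqrt{b^2 + (d - x)^2}}{v_2},$$

and setting $t'(x) = 0$ gives

$$\frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2},$$

which is Snell’s law, with the angles measured from the normal. A smaller $v_2$ (light slower in the denser medium) bends the ray towards the normal, as observed.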

Fermat’s principle of minimal time was criticised by the prevalent Cartesian school on two grounds. Firstly, the above assumption about the speed of light was unjustified, and not compatible with René Descartes’ notions that the speed of light in vacuo is infinite, and higher in denser media. (These are not necessarily contradictory statements: the mathematical machinery for comparing infinite or infinitesimal quantities was concurrently being developed, although Newton’s Principia was not yet published and the calculus would not be formalised for another century or two.) A more fundamental criticism of Fermat’s principle was that it is teleological: why does light ‘choose’ to take a time-minimising path, and ‘know’ how to find such a path in advance? Why should it ‘choose’ to minimise travel time and not some other quantity, such as distance (which would give a straight line)? Claude Clerselier, a Cartesian critic of Fermat, wrote in reply:

… The principle which you take as the basis for your proof, namely that Nature always acts by using the simplest and shortest paths, is merely a moral, and not a physical one. It is not, and cannot be, the cause of any effect in Nature.

In other words, although Fermat’s principle was mathematically equivalent to Snell’s law, and supported by experiment, it was not considered a satisfactory description of a physical basis behind Snell’s law, as no physical mechanism had been offered.

Particle mechanics and the principle of least action

Newton’s Principia was published in 1687. After some initial controversy of their own, Newton’s ideas had become accepted by the time of Maupertuis and Euler. Newton’s formulation of particle mechanics, including the law of motion $F = ma$ and the inverse square law for gravitation, gives a mathematical foundation for Kepler’s (empirical) laws of planetary motion.

An important development came in the 1740s with the formulation of the principle of least action by Pierre Louis Maupertuis and Leonhard Euler. Maupertuis defined action $S$ as an ‘amount of motion’: for a single particle, action is momentum $mv$ multiplied by the distance $s$ travelled; for constant speed, $s = vt$, so the action is $S = mv^2 t$. In the absence of a potential, this matches our modern definition of action, up to a factor of 2. (Maupertuis referred to the quantity $mv^2$ as the vis viva, or ‘living force’, of the particle.) Studying the velocities of two colliding bodies before and after collision, Maupertuis showed that the law of conservation of momentum (by now well-established) is equivalent to the statement that the final velocities are such that the action of this process is minimised.
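
As a quick check of the factor-of-2 claim, in modern notation: Maupertuis’ action for a particle of mass $m$ at constant speed $v$ over time $t$ is

$$S_{\text{Maupertuis}} = mv \cdot s = mv^2 t,$$

while the modern action of a free particle over the same motion is

$$S = \int_0^t \tfrac{1}{2} m v^2 \, \mathrm{d}t' = \tfrac{1}{2} m v^2 t.$$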

Euler is generally credited with inventing the calculus of variations in an early form, applying it to the study of particle trajectories. (The modern form was later developed by Lagrange, his student, in 1755.) Euler generalised Maupertuis’ definition of action into the modern action integral, and included a new term for potential energy. He showed in 1744 that a particle subject to a central force (such as planetary motion) takes a path (calculated by Newton) that extremises this action, and vice versa. Lagrange later showed more generally that the principle of least action is mathematically equivalent to Newton’s laws.
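
The equivalence can be sketched in modern notation: for a single particle with Lagrangian $L = \tfrac{1}{2} m \dot{x}^2 - V(x)$, requiring the action

$$S[x] = \int_0^T L(x, \dot{x}) \, \mathrm{d}t$$

to be stationary under variations of the path yields the Euler–Lagrange equation

$$\frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0, \qquad \text{i.e.} \qquad m\ddot{x} = -V'(x),$$

which is exactly Newton’s second law with force $F = -V'(x)$.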

But why is this a sensible definition of action? In fact, what is action?

Maupertuis’ reasoning was that ‘Nature is thrifty in all its actions’, positing that action is a sort of ‘effort’. He was happy to attribute the principle of least action to a God seeking to minimise the effort of motions in the universe. But how does one know to choose this definition of action and not some other? As with refraction, why minimise travel time and not distance? Maupertuis argued that one cannot know to begin with, but that the correct functional needs to be identified.

Fermat and Euler took a rather weaker view, and refused to make any metaphysical interpretations of their variational principles. Fermat stated that his principle is ‘a mathematical regularity from which the empirically correct law can be derived’ (Sklar 2012): this is an aesthetic statement about the theory, but says nothing about its origins.

Why do we find the principle of least action problematic?

Everyone agrees that the principle of least action is mathematically equivalent to Newton’s laws of motion, and both have equivalent status when compared against experiments. However, Newton’s laws are specified as differential equations with initial values (‘start in this state, and forward-march in time, with no memory about your past and no information about your future’). In contrast, the principle of least action is formulated as a boundary value problem (‘get from A to B in time T, accumulating as little action as possible’), governed by the Euler–Lagrange equations. Why are we less comfortable with the latter?

One reason is the question: given that we are at the initial position A, how can we know that we will be at B after time T? This can be resolved by realising that when we solve the Euler–Lagrange equations, we are not told the initial velocity, and have the freedom to choose it so that the final position will be B. Thus, one can convert between an IVP and a BVP: this is the approach taken by the shooting method for solving BVPs numerically.
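
As an illustration of the shooting method (a minimal sketch; the particular equation, $\ddot{y} = -g$ with $y(0) = 0$, and all the parameter values are my own choices for the example), one can bisect on the unknown initial velocity until the trajectory lands at the required endpoint:

```python
# Shooting method sketch: solve the BVP y'' = -g, y(0) = 0, y(T) = B
# by converting it into an IVP and searching for the unknown initial velocity.

def integrate(v0, g=9.81, T=2.0, n=1000):
    """Integrate the IVP y'' = -g from t = 0 to t = T with initial velocity v0,
    using explicit Euler steps; return the final position y(T)."""
    dt = T / n
    y, v = 0.0, v0
    for _ in range(n):
        y += v * dt
        v -= g * dt
    return y

def shoot(B, lo=-100.0, hi=100.0, tol=1e-9):
    """Bisect on the initial velocity until the trajectory ends at y(T) = B.
    (Valid here because y(T) increases monotonically with v0.)"""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrate(mid) < B:   # undershoot: need a larger initial velocity
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v0 = shoot(B=5.0)
# Analytically y(T) = v0*T - g*T**2/2, so v0 = (B + g*T**2/2)/T, about 12.3 here.
```

The same idea applies to the Euler–Lagrange equations: the unknown initial velocity is tuned until the final position lands on B, turning the BVP into a family of IVPs.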

Another reason is perhaps cultural: most of us are taught Newtonian physics before Lagrangian physics. This is pedagogically reasonable: the Newtonian formulation requires far less mathematical machinery. There is also a technical reason for feeling more comfortable with describing physics through an IVP than a BVP: by the Picard–Lindelöf theorem, an IVP is guaranteed to have a unique solution, at least locally in time; no such guarantee can be made for a BVP, which may have no solution or many.

Acknowledgements

The above essay has been guided by Lawrence Sklar’s book, Philosophy and the Foundations of Dynamics.

Type inference for lazy LaTeXing

I am doing some work with asymptotic expansions of the form

$h = h^{(0)} + \epsilon h^{(1)} + O(\epsilon^2)$

and I don’t care about second-order terms. The parentheses are there to indicate that these are term labels, not powers. But actually, there’s no need for them: if I ever need to raise something to the zeroth power, I can just write 1, and if I need to raise something to the first power, I don’t need to write the power at all. So there’s no confusion at all in writing $h^0$ instead of $h^{(0)}$! If I need to square it, I can write $h^{02}$. If I need to square $h^{(1)}$, I can write $h^{12}$; it’s unlikely I’ll need to take anything to the 12th power.

It’s an awful idea and a sane reviewer would reject it, but it does save time when LaTeXing…

Primary, secondary and tertiary sources

I am a bit annoyed that scientists don’t always seem to get the difference between primary, secondary and tertiary sources. Consider this situation:

• Prince (2008) reports that pigs are approximately blue.
• Quail (2006), Quaffer (2008) and Qi (2009) use the approximation that pigs are blue.
• Rout (2012) is a review article discussing the aforementioned works.

Which of the following are valid?

1. ‘Pigs are approximately blue (Prince 2008).’
2. ‘Pigs are approximately blue (Quail 2006, Quaffer 2008, Qi 2009).’
3. ‘We use the approximation that pigs are blue (Prince 2008).’
4. ‘We use the approximation that pigs are blue (Quail 2006, Quaffer 2008, Qi 2009).’
5. ‘We use the widely-used approximation that pigs are blue (Quail 2006, Quaffer 2008, Qi 2009).’
6. ‘We use the widely-used approximation that pigs are blue (Rout 2012).’
7. ‘The approximation that pigs are blue is widely used (Quail 2006, Quaffer 2008, Qi 2009).’
8. ‘The approximation that pigs are blue is widely used (Rout 2012).’
9. ‘Many authors, including Quail (2006), Quaffer (2008) and Qi (2009), use the approximation that pigs are blue.’
10. ‘Many authors, including Quail (2006), Quaffer (2008) and Qi (2009), use the approximation that pigs are blue (Rout 2012).’

Reporting biases in Genesis and Andrew Lloyd Webber

It’s always bugged me how, in Joseph and the Amazing Technicolor Dreamcoat, Pharaoh rather arbitrarily hires Joseph as his Vizier, responsible for Egypt’s economic policies over the next fourteen years, based solely on Joseph’s (as yet unproven) ability to explain his dreams. Even if Joseph’s forecasting was accurate, would a lowly foreign-born slave-turned-prisoner have had the administrative skills to oversee such huge reforms?

Then I realised: assuming that Egypt had existed for centuries before the time of Joseph, successive Pharaohs might have appointed many people as Viziers in this way, based solely on their ability to make predictions from individual dreams. Those who turned out to be wrong, or who were unable to enact the appropriate policies, were disposed of; their stories were not recorded and have not been passed down to us.

Explaining my PhD using the ten hundred most used words

You can try this yourself at http://splasho.com/upgoer5/.

In my work I look at cups full of hundreds of hundreds of hundreds of hundreds of tiny little bits, such as little glass balls, or the hard white stuff that you can put on your food. When you try to move these bodies made up of little bits, such as when you move them from one cup to another, they act a little bit like water but not quite in the same way. We understand how water moves from one cup to another quite well, but it’s much harder when you have these little bits because they can do a lot of different things that water can’t do. It’s also harder to learn about how these little bits move because they are usually not see-through, so that you can’t look inside the body and see how it’s moving under the top.

I want to understand how these bodies made up of tiny little bits move when you make them go over things. It’s hard to try it out in real life because the bits are not see-through, so instead I use a computer where I can play with pretend tiny little bits. This lets me look at how each little bit moves. It’s also much cleaner to do it on the computer and I can try many different set-ups at the same time, but you have to make sure that the pretend little bits are like real ones.

It’s important to study how stuff made up of tiny little bits go over things, because sometimes too much of this stuff can move at the same time, very quickly, over houses. This happens quite often and a lot of people get hurt. It’s also interesting because people also use a lot of this stuff to build things and we need to know what’s the best way of moving it.

Retinal detachment and Bayes’ theorem

I had my eyes tested yesterday, having put it off for several years. Happily, my vision seems not to have deteriorated in the last couple of years.

After the test, the optometrist told me that my short-sightedness meant that I was at risk of retinal detachment (RD). I asked if this was something to be worried about on a day-to-day basis. They said no, it was just something to be aware of: retinal detachment affects about 1 in 10,000 people, but 40% of cases happen in people with severe myopia.

I didn’t feel very comforted by this, since these figures don’t directly give my personal risk of retinal detachment given that I have severe myopia. To make sense of them, you need to know the prevalence of severe myopia.

According to Haimann et al. (1982) and Larkin (2006), the figure of 1 in 10,000 is actually an annual incidence: in a population of 10,000 healthy people, on average one new case of RD will develop after a year; the lifetime risk is therefore about 1 in 300. The prevalence of severe myopia (beyond −5 diopters) amongst Western Europeans aged 40 or over is about 4.6% (Kempen et al. 2004).

A calculation using Bayes’ theorem would predict that RD has an incidence, amongst people (Western Europeans aged 40 or over) with severe myopia, of about 1 in 1,000 per year, which corresponds to a lifetime risk of about 1 in 30.
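
The calculation itself is one application of Bayes’ theorem; a minimal sketch in Python (the variable names are mine, and the inputs are the figures quoted above):

```python
# Bayes' theorem: P(RD | myopia) = P(myopia | RD) * P(RD) / P(myopia),
# using the figures quoted in the text above.
p_rd_annual = 1 / 10000     # annual incidence of RD (Haimann et al. 1982)
p_rd_lifetime = 1 / 300     # quoted lifetime risk of RD
p_myopia = 0.046            # prevalence of severe myopia (Kempen et al. 2004)
p_myopia_given_rd = 0.40    # fraction of RD cases with severe myopia

annual = p_myopia_given_rd * p_rd_annual / p_myopia
lifetime = p_myopia_given_rd * p_rd_lifetime / p_myopia

print(f"annual incidence given severe myopia: 1 in {1 / annual:.0f}")    # ~1 in 1,150
print(f"lifetime risk given severe myopia:    1 in {1 / lifetime:.0f}")  # ~1 in 35
```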

This lifetime risk is surprisingly high, and not nearly as comforting as ‘1 in 10,000’. It is so much higher than the base incidence because severe myopia is fairly uncommon, and also because people live quite long lives; the exact relationship between lifetime risk and annual incidence depends on one’s lifespan, and the incidence is not uniform with age. Fortunately, the annual incidence of 1 in 1,000 is still quite small, so no, it’s not something to worry about every day.

This is an extremely simplified calculation using figures drawn from different populations; the Haimann study was for Iowans of all ages. Myopia is much more common in China, but it’s unlikely that there’s any data out there specifically on people of Chinese ethnicity living in Western Europe (both genetics and environment affect myopia). I’ve been unable to find more detailed information on the prevalence of retinal detachment as a function of myopia strength.

Gariano and Kim (2004) describe the mechanism by which severe myopia might cause retinal detachment.

TL;DR: Opticians don’t understand conditional probabilities, causing me to stay up late browsing optometry and epidemiology papers.

APS DFD 2017

I’ve spent the last few days in Denver, Colorado, where I attended the American Physical Society’s annual fluid dynamics conference and am staying for a few more days. I’ve been staying at an excellent hostel, with very welcoming staff who even organised what was my first Thanksgiving dinner. The usual stresses of travelling and conference preparation aside, this has been a very enjoyable trip.

In defence of anecdotal evidence

Anecdotal evidence is worthless, right? It comes about through uncontrolled conditions, and the people reporting it may report selectively (whether or not they intend to be biased). Thanks in part to the works of writers such as Richard Dawkins, we have learnt to dismiss anecdotes and personal testimonials, bringing us closer towards a world governed by Reason and statistics. And we can consign anything supported merely by anecdotes to the fire. Hurrah!

For many things in the natural world, it is relatively straightforward (if expensive) to isolate the thing to be tested, conduct experiments or controlled trials, and then quantify the effect of that thing, with well-defined error bars. There are well-established principles and procedures for designing clinical trials, which is why we can resolutely label things like homeopathy, claims about the MMR vaccine, and everything Deepak Chopra says as bullshit, even if there are occasional success stories.

But – and perhaps Dawkins and co. haven’t realised this yet – humans are complicated, and social phenomena, which involve multiple humans, are very complicated. It is impossible to control the environment in which they arise. Individual experiences are unique, and it is difficult to give them meaningful definitions or boundaries (see also this post), and to ensure that everybody uses the same definition. Relatedly, people do not always accurately report their experiences: something perceived to be ‘shameful’ will be underreported even if it is actually quite common.

For these reasons, many social phenomena have not been studied quantitatively. But

Absence of evidence is not evidence of absence.

If anything, it is evidence that you haven’t yet done a good enough job collecting evidence on the subject.

When nothing else is available, and when it is not possible to conduct a systematic, controlled and quantitative study, then anecdotal evidence is the best you can do, and it needs to be taken into account, provided it comes from a credible reporter with no vested interests. And you must hear that evidence, even if you do not give it much weight.

In more technical language, I am arguing that probabilities are subjective measures of degree of belief, not objective frequencies, and that any evidence should update your posterior probability, even if not by very much.
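
In that spirit, a toy sketch (all the numbers here are invented for illustration) of how many individually weak pieces of evidence compound under Bayes’ rule in odds form:

```python
# Toy Bayesian update (made-up numbers): each anecdote individually moves the
# posterior only slightly, but many anecdotes together dominate the prior.
prior_odds = 1 / 100   # we start out thinking the claim is unlikely
lr = 1.05              # each anecdote is only 5% likelier if the claim is true

def posterior_prob(n_anecdotes):
    """Posterior probability after n independent anecdotes (odds form of Bayes)."""
    odds = prior_odds * lr ** n_anecdotes
    return odds / (1 + odds)

print(posterior_prob(1))     # barely moves the needle: still about 1%
print(posterior_prob(200))   # hundreds of anecdotes: the prior is overwhelmed
```

Real anecdotes are of course not independent, which is why each should carry only a little weight; the point is merely that ‘a little weight’ is not ‘no weight’.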

What I’ve said so far has been relatively abstract, but a failure to understand this has truly harmful effects when we dismiss anecdotal evidence. When hundreds of people report that they have been victims of something, then we need to start taking their testimonials seriously.

The Everyday Sexism Project has collected reports from tens of thousands of women about the sexist abuses that they have suffered. These are idiosyncratic and can’t be categorised; they might have happened repeatedly over a long time, or be one-off events. These acts are often not visible: even the person doing or saying the sexist things might not realise that they are being hostile. An individual claim of sexism might be dismissed by suggesting a variety of mitigating circumstances, or even by assuming bad faith on the part of the reporter! But what is more likely: that misogyny exists in our society, or that thousands of women have conspired together to make up that myth? (You may find Occam’s razor useful.)

Everyday sexism is just one example of microaggression, which also happens in other contexts such as race and religion. Moreover the fear of being subject to a racist attack is just as relevant as the number of actual attacks. Fear has a chilling effect on society, and has a measurable effect on the economy, but by its very nature it is difficult to measure.

Other examples include people’s testimonials of an NHS (or other public service) that is unable to provide a good experience. When thousands of people across the country complain, it is no longer one disgruntled individual or one problematic local service; something nationwide is happening.

In summary,

When thousands of anecdotes are given, then it is no longer ‘merely’ anecdotal evidence.

As with the etymological fallacy, the failure to give anecdotal evidence the weight that it sometimes deserves is a dangerous fallacy, because it is easy to commit it, thinking that you are rational and your opponent is not. This arrogant attitude poisons a discussion.