Historical and philosophical contexts of the calculus of variations

The calculus of variations is concerned with finding functions that extremise (maximise or minimise) a particular quantity. A classic example is the catenary problem. What shape does a chain take when hung between two points? It is the unique shape that minimises the potential energy of the chain; and such a shape is called a catenary, and is given by the cosh function. The idea that the potential energy of a hanging chain should be minimised is a variational principle. Another example of a variational principle is the notion that a soap bubble or water balloon should have a shape that has a minimal surface area, namely a sphere. The variational principles in both examples predict the same shapes as those that one would find by constructing force-balance arguments on line or surface elements, but the variational formulations are far simpler to describe and implement.

The idea that theories might be summarised by neat variational principles has been proposed since antiquity. Such theories are aesthetically pleasing in their simplicity, and in line with the principle of parsimony (or Occam’s razor).

However, there is a major difference between the above examples and the principle of least action. In the above problems, the independent variables are spatial, and the problems concern a steady state. The principle of least action, which concerns the evolution of particle motions with respect to time, appears to require knowledge about the future. This is metaphysically troubling even today.

Optics and Fermat’s principle

In the early 1600s, a number of scientists, including Willebrord Snellius in 1621, independently discovered an empirical relationship between the angles of incidence and refraction when a beam of light passes through a boundary between different materials, which we now know as Snell’s law. In a 1662 letter, Pierre de Fermat showed that, under certain assumptions about the speed of light in different media, Snell’s law implies that the path taken by a ray between two given points is that of minimal travel time, and conversely, that a ray that takes a path of minimal travel time obeys Snell’s law at the interface. Fermat’s argument, however, assumes that light travels slower in denser media. We now know this to be true, but actual experimental evidence that light in vacuo travels at a finite speed was not available until 1676.
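The equivalence between minimal travel time and Snell’s law can be checked numerically. The following is only a sketch under assumed conditions: a flat interface along y = 0, a ray from (0, a) in the first medium to (d, −b) in the second, and made-up speeds v1 and v2. The function names are illustrative and not taken from any source.

```python
import math

def travel_time(x, a, b, d, v1, v2):
    """Time for a ray from (0, a) to (d, -b) that crosses the interface y = 0 at (x, 0)."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2

def fastest_crossing(a, b, d, v1, v2, tol=1e-9):
    """Ternary search for the crossing point of minimal travel time.

    The travel time is a convex function of x, so ternary search on [0, d] is valid.
    """
    lo, hi = 0.0, d
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, b, d, v1, v2) < travel_time(m2, a, b, d, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def snell_ratios(a, b, d, v1, v2):
    """Return (sin(theta1)/v1, sin(theta2)/v2) for the time-minimising path."""
    x = fastest_crossing(a, b, d, v1, v2)
    sin1 = x / math.hypot(x, a)            # sine of the angle of incidence
    sin2 = (d - x) / math.hypot(d - x, b)  # sine of the angle of refraction
    return sin1 / v1, sin2 / v2

# Illustrative values only: light slows down on entering the second medium.
r1, r2 = snell_ratios(a=1.0, b=1.0, d=2.0, v1=1.0, v2=0.75)
print(r1, r2)
```

If Fermat’s argument holds, the two printed ratios should agree to within the tolerance of the search, which is Snell’s law written in terms of speeds.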

Fermat’s principle of minimal time was criticised by the prevalent Cartesian school on two grounds. Firstly, the above assumption about the speed of light was unjustified, and not compatible with René Descartes’ notions that the speed of light in vacuo is infinite, and higher in dense media. (These are not necessarily contradictory statements: the mathematical machinery for comparing infinite or infinitesimal quantities was concurrently being developed, although Newton’s Principia was not yet published and the calculus would not be formalised for another century or two.) A more fundamental criticism of Fermat’s principle was that it is teleological: why does light ‘choose’ to take a time-minimising path, and ‘know’ how to find such a path in advance? Why should it ‘choose’ to minimise travel time and not some other quantity such as distance (which would give a straight line)? Claude Clerselier, a Cartesian critic of Fermat, wrote in reply:

… The principle which you take as the basis for your proof, namely that Nature always acts by using the simplest and shortest paths, is merely a moral, and not a physical one. It is not, and cannot be, the cause of any effect in Nature.

In other words, although Fermat’s principle was mathematically equivalent to Snell’s law, and supported by experiment, it was not considered a satisfactory description of a physical basis behind Snell’s law, as no physical mechanism had been offered.

Particle mechanics and the principle of least action

Newton’s Principia was published in 1687. After some initial controversy of their own, Newton’s ideas had become accepted by the time of Maupertuis and Euler. Newton’s formulation of particle mechanics, including the law of motion F = ma and the inverse square law for gravitation, gives a mathematical foundation for Kepler’s (empirical) laws of planetary motion.

An important development came in the 1740s with the formulation of the principle of least action by Pierre Louis Maupertuis and Leonhard Euler. Maupertuis defined action S as an ‘amount of motion’: for a single particle, action is momentum mv multiplied by the distance s travelled; for constant speed, s = vt, so the action is S = mv²t. In the absence of a potential, this matches our modern definition of action, up to a factor of 2. (Maupertuis referred to the quantity mv² as the vis viva, or ‘living force’, of the particle.) Studying the velocities of two colliding bodies before and after collision, Maupertuis showed that the law of conservation of momentum (by now well-established) is equivalent to the statement that the final velocities are such that the action of this process is minimised.

Euler is generally credited with inventing the calculus of variations in an early form, applying it to the study of particle trajectories. (The modern form was later developed by Lagrange, his student, in 1755.) Euler generalised Maupertuis’ definition of action into the modern action integral, and included a new term for potential energy. He showed in 1744 that a particle subject to a central force (such as planetary motion) takes a path (calculated by Newton) that extremises this action, and vice versa. Lagrange later showed more generally that the principle of least action is mathematically equivalent to Newton’s laws.

But why is this a sensible definition of action? In fact, what is action?

Maupertuis’ reasoning was that ‘Nature is thrifty in all its actions’, positing that action is a sort of ‘effort’. He was happy to attribute the principle of least action to some sort of God trying to minimise the effort of motions in the universe. But how does one know to choose this definition of action and not some other? As for refraction, why does one minimise travel time and not distance? Maupertuis argued that one cannot know to begin with, but that the correct functional needs to be identified.

Fermat and Euler took a rather weaker view, and refused to make any metaphysical interpretations about their variational principles. Fermat stated that his principle is ‘a mathematical regularity from which the empirically correct law can be derived’ (Sklar 2012): this is an aesthetic statement about the theory, but says nothing about its origins.

Why do we find the principle of least action problematic?

Everyone agrees that the principle of least action is mathematically equivalent to Newton’s laws of motion, and both have equivalent status when compared against experiments. However, Newton’s laws are specified as differential equations with initial values (‘start in this state, and forward-march in time, with no memory about your past and no information about your future’). In contrast, the principle of least action is formulated as a boundary value problem (‘get from A to B in time T, accumulating as little action as possible’), governed by the Euler–Lagrange equations. Why are we less comfortable with the latter?
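For reference, the equivalence can be written out for a single particle in one dimension. This is the standard modern textbook form, not the notation of Maupertuis or Euler; the symbols m (mass), V (potential energy) and L (Lagrangian) are introduced here for illustration.

```latex
\begin{gather*}
S[x] = \int_0^T L(x, \dot{x}) \,\mathrm{d}t,
\qquad L = \tfrac{1}{2} m \dot{x}^2 - V(x),
\qquad x(0) = A, \quad x(T) = B, \\
\delta S = 0
\;\Longrightarrow\;
\frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0
\;\Longrightarrow\;
m \ddot{x} = -V'(x).
\end{gather*}
```

The last equation is Newton’s second law with force F = −V′(x). The boundary conditions x(0) = A and x(T) = B are what make the variational statement a boundary value problem rather than an initial value problem.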

One reason is the question: Given that we are at the initial position A, how can we know that we will be at B after time T? This can be resolved by realising that when we solve the Euler–Lagrange equations, we have not been told what the initial velocity is, and have the freedom to choose it such that the final position will be B. Thus, one can convert between an initial value problem (IVP) and a boundary value problem (BVP): this is the approach taken with the shooting method for solving BVPs numerically.
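The shooting method can be sketched in a few lines. This is a minimal illustration, assuming a one-dimensional particle with equation of motion y″ = accel(y); the helper names `integrate` and `shoot`, the fixed-step RK4 integrator, and the use of bisection (which needs two initial velocities that bracket the target) are all choices made for this example.

```python
def integrate(v0, y0, T, accel, steps=200):
    """March the IVP y'' = accel(y), y(0) = y0, y'(0) = v0 forward to time T (classical RK4)."""
    h = T / steps
    y, v = y0, v0
    for _ in range(steps):
        k1y, k1v = v, accel(y)
        k2y, k2v = v + 0.5 * h * k1v, accel(y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, accel(y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, accel(y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y

def shoot(y0, yT, T, accel, v_lo, v_hi, tol=1e-10):
    """Bisect on the initial velocity until the trajectory reaches yT at time T."""
    miss_lo = integrate(v_lo, y0, T, accel) - yT
    miss_hi = integrate(v_hi, y0, T, accel) - yT
    if miss_lo * miss_hi > 0:
        raise ValueError("initial velocities do not bracket the target")
    while v_hi - v_lo > tol:
        v_mid = 0.5 * (v_lo + v_hi)
        miss_mid = integrate(v_mid, y0, T, accel) - yT
        if miss_lo * miss_mid <= 0:
            v_hi = v_mid                      # root lies in the lower half
        else:
            v_lo, miss_lo = v_mid, miss_mid   # root lies in the upper half
    return 0.5 * (v_lo + v_hi)

# Illustrative case: a ball thrown straight up under uniform gravity
# that must return to its starting height after 2 seconds.
g = 9.81
v0 = shoot(y0=0.0, yT=0.0, T=2.0, accel=lambda y: -g, v_lo=0.0, v_hi=50.0)
print(v0)
```

For uniform gravity the answer is known in closed form (v0 = gT/2), which makes this a convenient sanity check. Each trial velocity is an ordinary forward-marching IVP; only the outer search ‘knows’ about the endpoint B.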

Another reason perhaps is cultural: most of us are taught Newtonian physics before Lagrangian physics. This is pedagogically reasonable: the Newtonian formulation requires far less mathematical machinery. There is also a technical reason for feeling more comfortable with describing physics through an IVP than a BVP: according to the Picard–Lindelöf theorem, an IVP whose right-hand side is sufficiently well-behaved (Lipschitz continuous) is guaranteed to have a unique solution, at least on some finite interval; a similar guarantee cannot be made for a BVP.

Acknowledgements

The above essay has been guided by Lawrence Sklar’s book, Philosophy and the Foundations of Dynamics.

The police as the modern priest-Levite

Last weekend, I was cycling home at around 1am and was going through the East Road–Newmarket Road–Elizabeth Way roundabout, when I went past a woman who looked like she was waiting for a lift. However, the side of a major road away from any houses seemed like an odd place to wait for a lift, and she seemed distressed, so I stopped and asked if she was okay.

She was not okay. She was not waiting for a lift. She had been walking around Cambridge, lost, for the last two and a half hours. She was from a different city, had been on a hen night, had split off from the rest of the group, and was trying to find her way back to the Travelodge. Her phone battery had run out. She was dressed up for going out; her shoes were not suitable for that much walking, and the cold weather was becoming increasingly unpleasant. (As it happened, the Travelodge was just down the road, about two minutes’ walk away.)

She was grateful, and told me that she had tried to stop and ask multiple people for directions, eliciting only ‘I don’t know, sorry’, rudeness, or abuse. Most horrifyingly, she had approached police officers for help and directions, but the police refused to help her, on the grounds that she was not a victim of a crime and therefore not in trouble.

Is their job to uphold the Law and to stop crime, or to serve the public and keep them safe?

Chinese proverbs

I’ve noticed an annoying and persistent tendency for people to inaccurately claim that certain sayings are Chinese proverbs. ‘Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime’ is one example of such. ‘A picture is worth a thousand words’ is another.

These are admittedly only a couple of examples, so I may be going a bit far, but nonetheless, I claim that the following proverbs are true:

  • C0: For any proverb P, ‘P is a Chinese proverb’ is a proverb.
  • C1: For any proverb P, P is not a Chinese proverb if and only if P is claimed to be a Chinese proverb.

Since it is unnecessary for the Chinese to claim that a statement is a Chinese proverb (we need merely claim it to be a proverb), I make also the following claim:

  • C2: For any proverb P, ‘P is a Chinese proverb’ is not a Chinese proverb.

Can these claims be consistent, and which (if any) can I consistently claim to be Chinese proverbs?

Addendum: Oftentimes, the claim that a proverb is Chinese is used by orientalist woo-peddlers to create credence for their claims. Allow me therefore to go so far as to claim:

  • C3: For any proverb P, if P is claimed to be a Chinese proverb then P is false.

Is this consistent?

Confucianism in Harry Potter

I didn’t notice this at first, but one of my friends pointed out that most of the wizarding labour in the Harry Potter universe seemed to be employed by one of two employers. As a graduate of Hogwarts, you could respectably become a teacher at Hogwarts, or a civil servant of some description in the Ministry of Magic. Or you could leave the wizarding world and live a low-key existence amongst the Muggles. Appointments to either Hogwarts or the Ministry of Magic are conditional on you performing exceptionally well in a number of exams.

It then hit me that the Harry Potter world is actually an implementation of Confucius’ vision of society, complete with all the flaws in such a system!

The bureaucracy of the Ministry of Magic is sprawling and has an almost totalitarian (but not necessarily adversarial) influence over wizarding life. The same people constitute the executive, legislative and judiciary branches, with no separation of powers. There is only a very small private sector, and the state does not practise outsourcing.

Entry into the wizarding world is in theory open to all who display magical abilities, but in practice such abilities run mostly down bloodlines and there are relatively few Muggle-borns. While Muggle-borns are no less talented than their pure-blood colleagues, they nonetheless face either explicit hostility or subtle prejudice. Such prejudice is common within the pure-blood aristocracy, with dissenting voices being rare and limited to liberals such as the Weasley House, who have some, but not much, political influence.

The Ministry of Magic is mostly concerned with policing the activities of wizards, and is uninterested in the Muggle world, for the most part desiring neither to improve nor oppress the latter. Like the Party in Nineteen Eighty-Four, the Ministry is concerned with staying in power and focuses its efforts on fighting potential rivals such as Albus Dumbledore, rather than effectively addressing the evils of society.

Soliciting donations

I came across a couple of Buddhist monks who have been going up and down Denver’s 16th Street Mall all day today and yesterday. The monks were approaching people, insistently soliciting (cash) donations towards the construction of a temple in the city. Rather large donations, too: they suggested $20.

It is only through alms that the Buddhist community can survive — the alternative would be theocracy. But the proactive approach of the monks makes me very uncomfortable, especially in a community where Buddhists are in a very small minority. While dāna (charity) is an important part of Buddhist practice, there is no reason to impose this on anybody else. (Confession: I didn’t give.)

APS DFD 2017

I’ve spent the last few days in Denver, Colorado, where I attended the American Physical Society’s annual fluid dynamics conference and am staying for a few more days. I’ve been staying at an excellent hostel, with very welcoming staff who even organised what was my first Thanksgiving dinner. The usual stresses of travelling and conference preparation aside, this has been a very enjoyable trip.

Theft

My rucksack was stolen in a cafe at St Pancras station on Saturday afternoon, literally from under my feet. The bag contained my laptop and my passport, which is particularly annoying since I was meant to fly to a conference on Monday morning. My travel plans are in disarray, although, hopefully, an emergency passport can be issued; and I had to shell out on a new laptop (not to mention a new rucksack, and new stationery, as the bag also contained pens that I was rather fond of). Expensive as they may be, passports and laptops can be replaced, and fortunately most of my work was backed up (only one day’s work was lost). I’ve never felt particularly attached to a passport: it is after all just a tool, albeit a very useful one, and one with a shorter lifespan than that of most working animals.

The same will not be true of an SD card on the laptop, which contains photos from my time as an undergraduate: irreplaceable memories. Nor is it possible to replace the notebook full of painstakingly handwritten notes, taken at various lectures and conferences. And perhaps the saddest realisation is that these things (an SD card that’s falling apart, a diary, a collection of scribbles, a half-composed piece of music, a draft of a paper) are worthless to him, and, 36 hours on, he has probably thrown them away.

I cannot know his motives for stealing bags. Perhaps he is in financial difficulties and needs to make some quick cash? Perhaps he is in danger of eviction, or worse, if he does not settle a debt? Or perhaps he was just greedy? In any case, the thief did not realise, or didn’t care about, the inconvenience and loss that he has inflicted on me, which does not translate into gain for him. He saw me as merely a victim, or a donor, a means to an end.

What’s so bad about that?

I actually saw the thief as he sat down near me, some minutes before he made away with my bag. I didn’t pay attention to him: I glanced at him, even made brief eye contact, then went back to talking with my friends and drinking my coffee. No smile, no recognition. I paid so little attention that I couldn’t remember anything about his face. I can’t say if he had glasses, or facial hair, or what clothes he was wearing. (Fortunately, a CCTV camera got a glimpse of him, and perhaps the police will be able to collate enough shots of him to identify him, although I don’t have high hopes of getting my stuff back.)

Did I see the humanity in him? Did I see him as anything more than an unimportant background character?

He likely knows my name, if he’s looked at my passport, or anything else in the bag. Do I know him as anything other than ‘the thief’?

On nonviolence and the alt-right

The actions of the driver at the Charlottesville alt-right rally, who killed one person and injured another nineteen, are indefensible. So too are the views of the alt-right in general. But everybody trying to oppose them must be very careful not to sink to their level and use their tactics.

One repulsive practice of the alt-right is doxing, the releasing of private or sensitive information about a victim. When a victim’s address is published, they may receive death threats or even actual attempts on their lives. The alt-right has rather systematically done this to public figures such as Anita Sarkeesian, as well as otherwise private individuals. It is disappointing to see some anti-fascists going through videos and photos from the Charlottesville rally and asking on Twitter for the people there to be identified and publicly shamed. There are at least two problems with this. Firstly, the Universal Declaration of Human Rights is very clear on this:

No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.

‘Everyone’ includes people who hold abhorrent views, and simply attending a march – even a neo-Nazi one – is not justification for releasing somebody’s home address, work address or contact details. Secondly, identifying people from grainy photos is tricky and leaves room for plenty of false positives. There will be innocent people who look like someone at the rally, and who live nearby, who will be falsely identified as a Nazi, and shamed.

The same goes for physical aggression (which can be distinguished from self-defence in response to an attack). Of course punching somebody is far less serious than running them over, but that doesn’t make it easier to justify.

To be clear, the above is not a defence of the alt-right’s freedom of speech. Regardless of legal disputes over whether freedom of speech covers hate speech, it is a different matter as to whether violence against them is legal, morally defensible, or pragmatic.

On the other side of the Internet, other people can become dehumanised, and it is unsurprising that the hateful views of the alt-right have brewed over many years on the anonymised image boards of 4chan. But we must remember that:

When we oppose the alt-right, we must do our best to keep on the moral high ground.

Otherwise, we create a moral equivalence and help to justify Donald Trump’s notion that ‘both sides’ are to blame.

In other words, violent actions such as doxing and assault are ineffectual even on a pragmatic level, even if you don’t believe in nonviolence as a moral absolute. And, just as we expect Donald Trump to disavow his hateful supporters, we should call out our own allies who do use these tactics. Their actions are as important as their words and their beliefs.

In defence of anecdotal evidence

Anecdotal evidence is worthless, right? It comes about through uncontrolled conditions, and the people reporting it may report selectively (whether or not they intend to be biased). Thanks in part to the works of writers such as Richard Dawkins, we have learnt to dismiss anecdotes and personal testimonials, bringing us closer towards a world governed by Reason and statistics. And we can consign anything supported merely by anecdotes to the fire. Hurrah!

For many things in the natural world, it is relatively straightforward (if expensive) to isolate the thing to be tested, conduct experiments or controlled trials, and then quantify the effect of that thing, with well-defined error bars. There are well-established principles and procedures for designing clinical trials, which is why we can resolutely label things like homeopathy, claims about the MMR vaccine, and everything Deepak Chopra says as bullshit, even if there are occasional success stories.

But – and perhaps Dawkins and co. haven’t realised this yet – humans are complicated, and social phenomena, which involve multiple humans, are very complicated. It is impossible to control the environment in which they arise. Also, individual experiences are unique, and it is difficult to give meaningful definitions or boundaries (see also this post), and to ensure that everybody uses the same definition. Related to this, people do not always accurately report their experiences: something that is perceived to be ‘shameful’ will be underreported even if it is actually quite common.

For these reasons, many social phenomena have not been studied quantitatively. But

Absence of evidence is not evidence of absence.

If anything, it is evidence that you haven’t yet done a good enough job collecting evidence on the subject.

When nothing else is available, and when it is not possible to conduct a systematic, controlled and quantitative study, then anecdotal evidence is the best you can do, and it needs to be taken into account, provided it comes from a credible reporter, who has no vested interests. And you must hear that evidence, even if you do not give it much weight.

In more technical language, I am arguing that probabilities are subjective measures of a degree of belief, not objective quantities, and that any evidence should update your degree of belief (turning your prior probability into a posterior), even if not by very much.
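To make the arithmetic concrete, here is a small sketch of a Bayesian update. All the numbers are invented for illustration: a prior of 0.1 that a phenomenon is real, and anecdotes that are only slightly more likely to be reported if it is real (0.55) than if it is not (0.45). It also assumes that the reports are independent of one another, which is a strong assumption in practice.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: probability that the hypothesis is true after seeing one piece of evidence."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

def update_repeatedly(prior, p_evidence_if_true, p_evidence_if_false, n_reports):
    """Apply the same weak update once per report, assuming the reports are independent."""
    belief = prior
    for _ in range(n_reports):
        belief = posterior(belief, p_evidence_if_true, p_evidence_if_false)
    return belief

# Illustrative numbers only.
one = update_repeatedly(0.1, 0.55, 0.45, 1)
many = update_repeatedly(0.1, 0.55, 0.45, 50)
print(one, many)
```

A single weak anecdote barely moves the belief, but under these assumptions fifty of them move it most of the way to certainty. The independence assumption is where a sceptic would reasonably push back, and relaxing it would weaken the effect without removing it.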

What I’ve said so far has been relatively abstract, but a failure to understand this has truly harmful effects when we dismiss anecdotal evidence. When hundreds of people report that they have been victims of something, then we need to start taking their testimonials seriously.

The Everyday Sexism Project has collected reports from tens of thousands of women about the sexist abuses that they have suffered. These are idiosyncratic and can’t be categorised; they might have happened repeatedly over a long time, or be one-off events. These acts are often not visible: even the person doing or saying the sexist things might not realise that they are being hostile. An individual claim of sexism might be dismissed by suggesting a variety of mitigating circumstances, or even by assuming bad faith on the part of the reporter! But what is more likely: that misogyny exists in our society, or that thousands of women have conspired together to make up that myth? (You may find Occam’s razor useful.)

Everyday sexism is just one example of microaggression, which also happens in other contexts such as race and religion. Moreover the fear of being subject to a racist attack is just as relevant as the number of actual attacks. Fear has a chilling effect on society, and has a measurable effect on the economy, but by its very nature it is difficult to measure.

Other examples include people’s testimonials of an NHS (or other public service) that is unable to provide a good experience. When thousands of people across the country complain about this, then it is no longer an egotistic individual or a problematic local service; there is something nationwide happening.

In summary,

When thousands of anecdotes are given, then it is no longer ‘merely’ anecdotal evidence.

As with the etymological fallacy, the failure to give anecdotal evidence the weight that it sometimes deserves is a dangerous fallacy, because it is easy to commit it, thinking that you are rational and your opponent is not. This arrogant attitude poisons a discussion.