It should be remembered that just as the Declaration of Independence promises the pursuit of happiness rather than happiness itself, so the iterative scientific model building process offers only the pursuit of the perfect model. For even when we feel we have carried the model building process to a conclusion some new initiative may make further improvement possible. Fortunately to be useful a model does not have to be perfect.
George E. Box (1979)
In the book “Justice: What’s The Right Thing to Do?” (Sandel 2010) Michael Sandel aims to help us answer questions about how to do the right thing by giving some context and background in moral philosophy. Sandel is a philosopher based at Harvard University who is renowned for his popular treatments of the subject. He starts by illustrating decision making through the ‘trolley’ problem.
Trolley problems seem to be a mainstay of moral philosophy: in the original variant (Foot 1967) there is a runaway trolley1 rolling at speed down a track. It is approaching a set of points beyond which is a group of five workers. The workers do not have time to get out of the way of the trolley. They will be killed. You have the opportunity to switch the track, saving the five workers, but the trolley will run onto another track, killing a single worker. You could shout a warning, but somehow the workers wouldn’t hear you, or if they did hear they wouldn’t have time to get out of the way. The moral question is: should you switch the track? You would kill one worker, but save five. Apparently most of us think that in that situation we’d pull the switch.2
Sandel starts his book by looking at utilitarianism. Utilitarianism says that we should make decisions that lead to the greatest benefit to humanity. Under utility theory, Sandel explains, we pull the lever because we expect fewer people will be unhappy when one person dies than when five people die. Of course we can debate that, and we will come back to that in a moment.
Utility Theory and Utilitarianism
In moral philosophy utilitarianism is the idea that when considering a choice of actions we should choose the one that brings the most benefit to the population.
Utilitarianism is a philosophical variation of utility theory. It is due to Jeremy Bentham and John Stuart Mill. For their time the ideas in utilitarianism were very advanced. But it needs to be borne in mind that their era was rather more limited mathematically and scientifically than our own. Jeremy Bentham predates Laplace and was born only 21 years after Newton’s death. In Jeremy Bentham’s time differential calculus was only taught in advanced degrees in the most sophisticated universities. Probability was only just beginning to be understood.
Utilitarianism is an example of a philosophy that suggests that the result of a decision can be evaluated mathematically. By quantifying the benefits of the decision and weighing them against the downsides, the idea is that decision making can be rendered mathematical. Such mathematical functions are known as utility functions. The basic principle is that we can encode the good and the bad in a mathematical formula.
Since Bentham and Stuart Mill, the idea of framing our actions by formulating them mathematically has become inextricably interlinked with the domain of mathematical modelling. One way we have of quantifying the value of a mathematical model is its ‘goodness of fit’. In particular, we validate our models by evaluating how well the model does at predicting a quantity of interest. Sometimes that prediction can be associated with a direct monetary gain (such as in a stock market) or sometimes that prediction is associated with an improvement in the quality or length of life (such as in the diagnosis of disease).
Sensitivity and Specificity
Consider a model that tries to predict, on the basis of a test, whether an individual has cancer or not. Most, if not all, clinical tests are imperfect: a positive test does not always mean that the disease is present, and a negative test does not always mean that it is absent. The proportion of people with the disease who test positive is known as the sensitivity of the test. It is the count of those people who are correctly diagnosed with the disease divided by the number who really have the disease. A test with 50% sensitivity will correctly diagnose 50% of the people who really have the disease.
You might think it is a good thing to have a highly sensitive test, and you’d be right. But actually it is easy to increase the sensitivity of a test. Imagine a lazy lab technician, who rather than carrying out the test, just answers ‘disease present’ for every sample sent. The test will now have 100% sensitivity! Unfortunately all those who don’t have the disease will also be categorised as having the disease. The technician has rendered the test non-specific. To account for a test that is weak in this way we also measure the quality of a test by its specificity: the proportion of patients who don’t have the disease that are correctly identified as disease free. For the lazy technician the specificity is 0%.
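As a minimal sketch of these two definitions, the following Python computes sensitivity and specificity from lists of true disease status and test results. The patient data and the function names are hypothetical, made up purely for illustration. The lazy technician is modelled as a test that answers positive for every sample.

```python
def sensitivity(has_disease, test_positive):
    """Fraction of people who really have the disease that test positive."""
    true_pos = sum(d and t for d, t in zip(has_disease, test_positive))
    return true_pos / sum(has_disease)


def specificity(has_disease, test_positive):
    """Fraction of people without the disease that test negative."""
    true_neg = sum((not d) and (not t) for d, t in zip(has_disease, test_positive))
    return true_neg / sum(not d for d in has_disease)


# Hypothetical sample: 3 patients with the disease, 5 without.
has_disease = [True, True, True, False, False, False, False, False]

# The lazy technician reports 'disease present' for every sample.
lazy_results = [True] * len(has_disease)

print(sensitivity(has_disease, lazy_results))  # 1.0
print(specificity(has_disease, lazy_results))  # 0.0
```

Both functions assume the sample contains at least one person with the disease and at least one without; otherwise the ratio is undefined.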
For clinical tests there is a trade-off between sensitivity and specificity. A test that is highly sensitive (captures all the diseased population) will often be non-specific (it will incorrectly diagnose a large portion of the non-diseased population).
When considering which tests to use we necessarily have to decide what the effect of wrong decisions is on individual people. If a test isn’t very sensitive, then patients will be given the all clear, when they actually have a disease. But if a test is highly sensitive but non-specific, many patients will be faced with unnecessary worry that they have the disease when none is present. The number of people involved will also depend on the base rates of the disease in the population. There will also be monetary cost associated with processing the tests and those who have a positive result.
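To see how the base rate changes the number of people affected, here is a small sketch that converts a sensitivity and a specificity into expected head counts. The population size, base rate and test quality are all assumed values chosen for illustration, and `expected_outcomes` is a hypothetical helper.

```python
def expected_outcomes(population, base_rate, sensitivity, specificity):
    """Expected number of people in each outcome of a screening programme."""
    diseased = population * base_rate
    healthy = population - diseased
    return {
        "true_positive": diseased * sensitivity,
        "false_negative": diseased * (1 - sensitivity),
        "false_positive": healthy * (1 - specificity),
        "true_negative": healthy * specificity,
    }


# Hypothetical numbers: 10,000 people tested, a 1% base rate,
# and a test with 90% sensitivity and 90% specificity.
outcomes = expected_outcomes(10_000, 0.01, 0.9, 0.9)
for name, count in outcomes.items():
    print(f"{name}: {count:.0f}")

# Share of positive results that correspond to real disease.
positives = outcomes["true_positive"] + outcomes["false_positive"]
print(f"positives with disease: {outcomes['true_positive'] / positives:.1%}")
```

With these assumed numbers about 990 healthy people receive a worrying positive result, against about 90 who really have the disease, so only around 8% of positive results correspond to real disease even though the test looks good on both measures.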
We want to know the utility of the test, and Utility Theory is one approach we have to assessing that. If we decide that a particular sensitivity and specificity is acceptable for a test, or if we judge how much worry and how many unnecessary trips to the doctor patients must endure, then we are placing value on these aspects of life.
To make a single decision we must weigh up these different aspects against one another. We can then decide which outcomes we value the most. Even if we don’t write it all down explicitly, by our actions we can see that we are making decisions that define a utility function: a mathematical function that weights how we value the competing factors. Utility functions are so common that they receive many different names: objective function, cost function, error function, fitness function, risk function.3 The motivation for any given utility function can be different but the end result is the same, a mathematical formula by which we can quantify the relative value of a set of predictions or decisions.
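A utility function of this kind can be sketched in a few lines. The version below is only an illustration: the values attached to each outcome are invented, the units are arbitrary, and `expected_utility` is a hypothetical name. It shows how the same two tests can be ranked differently once we change how heavily we weight a missed diagnosis.

```python
def expected_utility(base_rate, sensitivity, specificity, values):
    """Average value per person tested, given a value for each outcome."""
    probabilities = {
        "true_positive": base_rate * sensitivity,
        "false_negative": base_rate * (1 - sensitivity),
        "false_positive": (1 - base_rate) * (1 - specificity),
        "true_negative": (1 - base_rate) * specificity,
    }
    return sum(values[outcome] * p for outcome, p in probabilities.items())


# Hypothetical values (negative numbers are costs, in arbitrary units).
values = {"true_positive": -1, "false_negative": -100,
          "false_positive": -5, "true_negative": 0}

careful_test = expected_utility(0.01, 0.9, 0.9, values)
lazy_technician = expected_utility(0.01, 1.0, 0.0, values)
print(careful_test > lazy_technician)  # True

# Valuing a missed diagnosis far more heavily reverses the preference.
values["false_negative"] = -10_000
print(expected_utility(0.01, 0.9, 0.9, values)
      > expected_utility(0.01, 1.0, 0.0, values))  # False
```

Neither set of weights is the right one. The point is that choosing between the tests is equivalent to choosing the weights, whether or not we write them down.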
The more we desire accountability for our decisions, the more that we require that they are rationalised, the greater our tendency to be explicit about the mathematical form of our utility function. The more explicit we become the easier it is to see how much value we are associating with aspects of our lives that we might instinctively feel cannot be priced. Aspects that cannot be bought or sold: our health or our life itself.
How can we reconcile this drive for accountability with our natural instinct that we should not be placing a price on such things?
The first thing we have to do is acknowledge that the application of ideas about utility is much more sophisticated than is sometimes recognised in popular treatments of the subject. In Sandel’s book he criticises utilitarianism by framing it in the context of the historical arguments given in a specific era. But he doesn’t place those writings in context. The utility functions of Jeremy Bentham and John Stuart Mill provide a nascent theory, but there are a number of ways in which those ideas can be updated given the knowledge we have gained in the intervening 250 years.
The Push and the Trolley
In his book, Sandel follows up his initial trolley example with a more complex one.4 This time you are on a bridge. There is a rail line going under the bridge and there are workers on the line. This time there is a trolley going towards the five workers on the far side of the bridge. The workers will be killed by the runaway trolley. There is a man on the bridge with you, but there is nothing else to hand. As before you could shout, but the workers can’t get out of the way in time.
Apparently your only opportunity is to push the other man off the bridge onto the rails, thereby deflecting the trolley and saving the five men working on the line. You could jump onto the line yourself to deflect the trolley, but the man on the bridge is heavier than you, and you would be too light to deflect the trolley. Your only chance is to push the other man, the ‘fat man’, off.
It seems most people would choose not to push the fat man off. Sandel finds this difficult to explain from the perspective of utilitarianism and becomes dismissive of utilitarianism as a result.
We may be doing Sandel an injustice but it seems there is a significant oversimplification in this scenario. In an attempt to construct a counter example to illustrate a point, a key aspect has been introduced in the second example that is not so present in the first. It’s an aspect that humans are faced with every day, but we are not entirely sure how we deal with it. It is a domain where humans can outperform machines. That aspect is uncertainty.
In an effort to define a situation that primes us for decision making the trolley scenarios tell stories. The story is far more complex in the second example than the first. We are unable to help but imagine the scenario. We might picture that the bridge is a viaduct. We might picture that it is made of sandstone. We might imagine that the fat man is wearing a blue shirt. We can picture what it means to push him off. One of Sandel’s arguments is that the difference between the two scenarios is that people are unable to kill a man through direct interaction, and this aspect isn’t explained by utilitarianism. That is certainly a plausible explanation, but we don’t have to leave utilitarianism behind so quickly.
Evolution and Utilitarianism
Bentham was not aware of Darwin’s principle of natural selection: he predates Darwin. Bentham believed that we should take an action if it maximised the happiness of the population (thereby minimising pain). So his utility function was the sum of the happiness of the population. For its time this is an interesting concept, but Bentham’s disciple, John Stuart Mill, struggled with the idea that a night of debauchery was the same form of happiness as, for example, staying in and reading a good book.5 He argued that these different happinesses should not be valued the same. This is a clear flaw in the idea of a single utility function which Bentham had proposed. However, we should probably be allowed to update Bentham’s ideas a little in the light of what we’ve discovered since.
Darwin’s idea was that species evolve through natural selection. Natural selection is a relatively simple concept, but it has some complex consequences. Natural selection just suggests that successful strategies will prevail. This is a somewhat self-verifying statement. The measure of success is that the strategy did prevail. However, there are some complex consequences. For a species to prevail it needs to survive. Natural selection forces us to think about why our behaviour might be helpful in determining our survival. Bentham’s idea was that we should maximize happiness. But given knowledge of Darwin, it may be that we’d prefer to identify happiness as an intermediate reward. Natural selection implies that one of our longer term goals is the survival of our species, with intermediate implications for our selves, our societies and our ways of life.
Most of us wouldn’t accept that our happiness at any precise moment is an absolute assessment of where we feel we are in life. In fact, we often find our state of happiness much more sensitive to changes in our circumstances than any particular absolute measurement we can make. If our circumstances are good in the absolute sense, but not improving, then we may have to consciously remind ourselves of how lucky we are to gain pleasure from our position.
The children’s book “A Squash and a Squeeze” (Donaldson and Scheffler 2004) illustrates this idea nicely with a rendering of an old folk tale. An old lady lives in a house which she finds too small. She asks the advice of a “wise old man” who suggests she adds a chicken, a pig, a goat and a cow to her living quarters. The lady does as instructed, finding her accommodation increasingly cramped as she does so. Finally the man tells the lady to let all the animals out again. The lady is now happy, finding her house to be far more spacious without the animals. She is happy at that moment, despite her absolute circumstances not having changed since she first went to the wise old man for advice. This folk tale may not provide a practical solution to the housing crisis, but its moral is a caricature of our sensibilities: our happiness is more sensitive to our changes of circumstance than our absolute positioning. From an evolutionary perspective this makes sense. If as organisms we became satiated at a particular stage of achievement then our species or societies could become complacent.
The mathematical foundations of the study of change are given by Newton and Leibniz’s work on differential calculus. In Bentham’s time these ideas were taught only in the most advanced university courses. The argument above suggests that actually happiness is some (monotonic) function of the gradient of whatever personal utility function we have, not the absolute value. This idea also may deal nicely with the issue of different types of happiness. A night of debauchery may make us instantaneously very happy, implying a high rate of change. But it is over quickly, implying that the absolute change in our circumstances is small (the absolute improvement in the utility would be the level of happiness multiplied by the time we were happy for). Of course this improvement may be offset by whatever the consequences of the debauchery are the next day. John Stuart Mill’s variation on utilitarianism considered ‘higher pleasures’, such as the pleasure gained from literature and learning. See also Eleni Vasilaki’s perspective on Epicurus (Vasilaki 2017). Instantaneously we may experience less happiness when engaged in these activities compared to whatever our night of debauchery involved. However, these activities can be sustained for a very long period and allow us to achieve more. The absolute improvement in our circumstance would be given by the period of study multiplied by the pleasure given.
In the terminology of differential calculus, our happiness must be integrated to form our utility. Not instantaneously measured. It may be that Stuart Mill’s differentiating of happiness could have been reconciled with utility theory by realising he was actually distinguishing between sustainable forms of happiness and non-sustainable. Unfortunately we can’t revisit him to ask, but we can at least do him the justice of giving him the benefit of the doubt.
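One way to write this down, as a sketch under the simplest assumption that happiness is exactly the rate of change of utility (the argument only requires some monotonic function of it), is given below. The symbols $h(t)$ for happiness at time $t$, $U(t)$ for utility and $T$ for the duration of an activity are notation introduced here for illustration, not taken from Bentham or Mill.

```latex
h(t) = \frac{\mathrm{d}U}{\mathrm{d}t},
\qquad
U(T) - U(0) = \int_0^T h(t)\,\mathrm{d}t .
```

If happiness is constant at a level $\bar{h}$ over the period, the integral reduces to $\bar{h}\,T$: the level of happiness multiplied by the time we were happy for. A brief, intense pleasure (large $\bar{h}$, small $T$) can then contribute less utility than a milder pleasure sustained over a long period.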
Daniel Kahneman’s Nobel Memorial Prize in Economics was awarded for the idea of prospect theory. Kahneman describes the theory and its background in his book, “Thinking, Fast and Slow” (Kahneman 2011). Prospect theory is not a mathematical theory, but a theory in behavioral economics based on empirical observations of human behavior, in particular of how we value different alternatives that involve risk. A key observation of Kahneman’s is in line with our analysis above: people are responsive to change in circumstance, not absolute circumstance. Prospect theory goes on to identify asymmetries in our sensitivities. A negative change in circumstance weighs upon us more heavily than the equivalent positive change in circumstance.
Bentham’s ideas focussed around the idea of a global utility, maximisation of happiness across the population. Darwin’s principle of natural selection actually insists that there must be variation in the population, and therefore variation in our perception of our circumstances.
Natural selection relies on variation, because if there is no variation, then there can be no separation between effective and ineffective strategies. Different strategies arise from different value systems. If all organisms were to pursue the same strategy, then the circumstantial judgment of the selection process would fall upon all members of the species simultaneously: they would die or survive together.
A Cognitive Bias towards Variance
One of the themes that Kahneman explores is the tendency of humans to produce, through their System 2 thought processes (their ‘slow thinking brains’), overcomplicated explanations of observed data. There’s a tendency for people to focus on a detailed narrative as if it was pre-determined and within the control of all participants. In practice we cannot control events in such a regulated manner.
To predict we need data, a model and computation. Data is the information we are given, the model is our belief in the way the world works and computation is required to assimilate the two. This is true for humans and computers. Ignoring the quality of the data for a moment, and focussing on the model, our predictive system can fail in one of two ways. It can either over simplify or it can over complicate.
This binary choice may seem obvious, but it has some significant consequences. The phenomenon was studied in machine learning by Geman et al (Geman, Bienenstock, and Doursat 1992) who referred to it as the ‘bias variance dilemma’. They decomposed errors into those due to oversimplification (the bias error) and those due to insufficient data to underpin a complex model.
Bias errors are errors that arise when your model is not rich enough to capture all the nuances of the world around us. Bias errors occur when the rich underlying phenomena underpinning an observation are ignored and a simpler model explanation is given. An example of a bias error would be one that arises from the simple model “home teams always win sports games”. There is some truth to the home advantage: this model will do better than 50/50 guessing. But it is an oversimplification. It is biased. However, because we have a lot of data about sports games, two experts using this rule to predict outcome would make consistent predictions.
An error due to variance is one which occurs when we go too far the other way. There are a myriad of factors that could affect the outcome of a sports event: weather, balloons on the pitch, the mental and physical fitness of each of the players, the quality of the pitch. If we take them all into account we might hope for better predictions. But in reality, we haven’t seen enough data to determine how each of these factors affects outcome. Badly determined parameters lead to high variance error. Two experts see slightly different data and weight these complex factors according to their perspective. As a result their predictions can vary, leading to variance error.
The important point is that these errors are fundamentally different in their characteristics. The error due to bias, the simplification error, comes about by not taking all the factors into account. In statistical models bias errors are very common: indeed they are often preferred because they are associated with simpler models and the parameters often have some explanatory power.
The type of error Kahneman is describing in human explanations would be termed an error due to variance. A variance error is different from a bias error. In a variance error you may have a model that is sufficient to describe the underlying system, but you don’t have enough data or information to pin down exactly how your observed outcome came to be. A characteristic of error due to variance is that different observers may have highly rich, but conflicting, explanations of what brought about the phenomenon: the sort of conflicting explanations that bring about lively debate in television studios during half-time breaks in football matches.
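The two kinds of error can be seen in a small simulation. The sketch below is an illustrative toy, not the analysis of Geman et al: the underlying function, the noise level and both models are assumptions made here. The simple model ignores the input and always predicts the average outcome. The complex model copies the outcome of the single most similar past observation. Each 'expert' sees a different small data set.

```python
import math
import random
import statistics


def true_fn(x):
    """The (assumed) underlying phenomenon the experts try to predict."""
    return math.sin(2 * math.pi * x)


def sample_dataset(rng, n=20, noise=0.5):
    """A small, noisy data set: what one expert happens to have seen."""
    xs = [rng.random() for _ in range(n)]
    ys = [true_fn(x) + rng.gauss(0, noise) for x in xs]
    return xs, ys


def simple_model(xs, ys, x0):
    """Oversimplified: ignores x0 and predicts the average outcome."""
    return statistics.fmean(ys)


def complex_model(xs, ys, x0):
    """Overcomplicated: copies the nearest single observation."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias_and_variance(model, x0=0.25, trials=2000, seed=0):
    """Squared bias and variance of a model's prediction at x0."""
    rng = random.Random(seed)
    preds = [model(*sample_dataset(rng), x0) for _ in range(trials)]
    bias_sq = (statistics.fmean(preds) - true_fn(x0)) ** 2
    return bias_sq, statistics.pvariance(preds)


for model in (simple_model, complex_model):
    bias_sq, variance = bias_and_variance(model)
    print(f"{model.__name__}: bias^2 = {bias_sq:.3f}, variance = {variance:.3f}")
```

Under these assumptions the simple model gives nearly the same answer whichever data set it sees but is consistently wrong, while the complex model is right on average but its answer swings widely from one data set to the next.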
The bias-variance dilemma is a major challenge in machine learning. One widely accepted solution to the dilemma is that we choose a model which exhibits larger bias because even though it is known to be incorrect (too simple) for the data we have available it will make better predictions. Many of Kahneman’s mechanisms and solutions for human irrationality actually do introduce simple statistical models to improve the quality of prediction. Kahneman relates how decision making can be rendered more consistent and higher quality in this manner.
An alternative and widely used solution in machine learning is to develop large families of complex models that exhibit variance, just as individual humans do. However, once these models are trained they are not relied on individually but they are combined in ‘ensembles’ to predict together. They vote on the solution or their average prediction is taken. This idea is very similar to the ‘wisdom of the crowds’. Seen from this context there are very good reasons why a ‘population’ of intelligent beings should exhibit variance-error instead of bias-error. A characteristic of bias-error would be that we would all be consistent in our predictions. This is good for accounting for behaviour, but it is a serious problem if we are all consistently wrong. Variance-error implies that we all come up with different, over-complicated, reasons why events transpire as they did. Taken as a population we cover a wide range of alternatives. As a result we act in different ways, and make different decisions. It is clear that we will never be all correct, but when it comes to evolution, the important thing is that we are never all wrong.
Decision Making and Bias-Variance
A further advantage of choosing variance-error over bias-error for a population is that a consistent and robust prediction can always be achieved by averaging outcome. In machine learning approaches such as bagging and boosting (Breiman 1996) can be used to reduce variance-error in a population of models.6 Advocates of the “Wisdom of the Crowds” propose the same principle. By preferring bias-error in our population we have no recourse, but models that exhibit variance-error can always be combined to create a more stable prediction from the population as a whole: democratic decision making is one way to achieve this.
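The effect of averaging can be sketched with a toy crowd. This is plain averaging of independent predictions under assumed numbers, not an implementation of bagging or boosting. Each expert's prediction is the true value plus a bias shared by the whole crowd plus the expert's own random error. `TRUE_VALUE`, `shared_bias` and `noise` are hypothetical names and values.

```python
import random
import statistics

TRUE_VALUE = 1.0  # hypothetical quantity that every expert tries to predict


def crowd_prediction(rng, n_experts, shared_bias, noise):
    """Average the predictions of n_experts individual experts."""
    predictions = [TRUE_VALUE + shared_bias + rng.gauss(0, noise)
                   for _ in range(n_experts)]
    return statistics.fmean(predictions)


def mean_squared_error(n_experts, shared_bias, noise, trials=5000, seed=0):
    """Estimate the squared error of the crowd's averaged prediction."""
    rng = random.Random(seed)
    errors = [(crowd_prediction(rng, n_experts, shared_bias, noise) - TRUE_VALUE) ** 2
              for _ in range(trials)]
    return statistics.fmean(errors)


for n in (1, 5, 25):
    variance_crowd = mean_squared_error(n, shared_bias=0.0, noise=0.5)
    biased_crowd = mean_squared_error(n, shared_bias=0.5, noise=0.0)
    print(f"{n:2d} experts: variance-error {variance_crowd:.3f}, "
          f"bias-error {biased_crowd:.3f}")
```

In this sketch the error of the crowd whose members disagree shrinks as more experts are averaged, while the crowd that shares a common bias is exactly as wrong with 25 members as with one.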
So the ‘rational’ behaviour of a population under natural selection is to sustain a variety of approaches to life. It follows then, that if there is to be natural selection within our species, our ideas of achievement should vary. Our individual utility should be subjective. What brings me happiness, may not bring you happiness. We probably have different ideas of debauchery, literature and learning. While we can disagree with each other’s tastes, natural selection tells us that our species is more robust if there is a diversity of approaches. Darwin’s principle tells us we should be like this.
When we see half-time football pundits debating their convoluted explanations of the way the match is evolving we should remember that their arguments are all important and each one may have some validity. They are the result of overly complex models being applied to little data. The game of football is fundamentally stochastic, but the analysts treat it as deterministic.
This phenomenon is not new, and it is not constrained to football punditry. In 1954 the psychologist Paul Meehl wrote a book about how clinical experts can be outperformed by simple statistical models (Meehl 1954). Meehl suggested they ‘try to be clever and think outside the box’. Kahneman addresses this challenge in Chapter 21 of his book.
complexity may work in the odd case, but more often than not it reduces validity
going on to say
humans are incorrigibly inconsistent in making summary judgments of complex information. When asked to evaluate the same information twice, they frequently give different answers.
Unreliable judgements cannot be predictors of anything.
The two approaches Kahneman proposes for dealing with this in human society are:
- Replace human punditry with simple statistical models. This also makes sense from a statistical point of view: if we desire consistency we should replace the models that exhibit variance-error (the humans) with models that exhibit bias-error (simple statistical formulae).
- Exploit wisdom of the crowds. Wisdom of the crowds is a proposal that human opinion should be aggregated to improve predictions. This is consistent with the idea that humans tend to make variance-errors.
Punditry is widespread, and the bias-variance analysis shows us that there are good reasons why, under natural selection, we should value such diversity of opinions. What is also important is that we should develop mechanisms in society for these opinions to be properly represented when making a decision.
This error is dangerous: it is one of the intellectual failings of those that sought to put ideas from eugenics into political practice. Early philosophies based on natural selection were overly focussed on the average value of the ‘fitness’ of a population. Trying to increase this value whilst simultaneously reducing variation is a very dangerous game. It is artificial selection. It assumes that you have preordained what the future natural circumstances are going to be. It may be OK for race horses, greyhounds, crops, sheep and cows because in those circumstances we are aiming to control their environment. It is not OK for the human race.
We are right to express moral outrage at what the negative eugenicists tried to achieve. But it was also motivated by a flawed understanding of science. Their model was wrong. Populations that are capable of excelling in a particular environment, because they are highly tuned to it, are rapidly extinguished when circumstances change. If the animals in a species become too specialised then they may not be able to respond to changing circumstances. Think of cheetahs and eagles vs rats and pigeons.
Socially the same principle should hold. I may not agree with many people’s subjective approach to life, I may even believe it to be severely sub-optimal. But I should not presume to know better, even if prior experience shows that my own ‘way of being’ is effective. Variation is vitally important for robustness. There may be future circumstances where my approaches fail utterly, and other ways of being are better.
A Universal Utility
The quality of our subjective utilities at any given time is measured by their effectiveness in the world. Survival of the species indicates that it is the sustenance of the entire human species that should concern us in the long run, although there will be many intermediate effects that we will be looking to achieve in the medium term. Indeed, we may even question if there are circumstances under which we would not wish the human species to survive.7 The universal utility by which we are judged is therefore difficult to define. We can pin down aspects of it, but perhaps the best we can do is seek compromise between our individual utilities, while maintaining awareness that there may be outside forces, for example climate change, that will have such a detrimental effect on all our lives that it is worth investing significant time and effort as a society to try and reduce our exposure.
Let’s get back to the trolleys.
The Real Ethical Dilemma
The trolley problem is an oversimplification, and one which is not useful in characterizing the moral dilemmas we are faced with in the modern era of computer decision making. Instead of the trolley problem, let’s propose a new dilemma, and let’s focus on driverless cars.
Most arguments for driverless cars focus on the overall reduction in human death rates we expect to result from their adoption. For example, if we introduce driverless cars and bring about a 90% reduction in deaths on the road, then that surely is a good thing. But what if the remaining 10% of deaths are focussed on a particular section of the population? For example, what if we find that the only people those cars continue to kill are cyclists?
Now there are ethical and moral questions about what we have developed. Even if we have reduced the total number of cyclist deaths8 is it fair to disproportionately affect one section of the population? A simplistic utilitarian perspective would say yes, please proceed, although we individually might be uncomfortable with this. Let’s explore further.
Utilitarianism appears to be telling us to favour a set up that could disproportionately affect a minority: in this case cyclists. Let’s develop a more sophisticated view of the right utility and see how it might affect our conclusions.
Uncertainty: The Tyger that Burns Bright
There are two principles we should take into account when considering how we should aim to effect the evolution of our society. The first we have discussed above, uncertainty. The second is Darwin’s principle of natural selection.
Natural selection is a simple idea although it’s had a difficult history. Unfortunately, when applied on its own it leads to some unsophisticated principles that don’t work in practice. First of all, we might naively assume from natural selection that there is a si