The Trolley Problem Becomes a Reality: The Ethics of the Self-Driving Car

Originally posted on 03/12/17.


This paper explores the various approaches to machine ethics in the context of self-driving cars. The fairly recent introduction of these vehicles into the public sphere indicates the need for a morally sound structure from which we can define the rules and methods these self-driving cars will follow. The paper studies the different challenges faced in the pursuit of this ethical framework, including the Trolley Problem and the difficulties of using it as a template. These problems are taken into account when considering both top-down and bottom-up approaches, and the benefits of each are outlined and analyzed. An ideal solution, one that combines elements of both the top-down and bottom-up concepts, is proposed. Finally, methods are introduced to monitor the actions of these vehicles, which serve to mitigate fear of autonomous vehicles and to explain the logical reasoning behind their decisions. The existence of a consistent ethical framework for self-driving cars will allow this industry to develop in a morally sound manner.

Keywords: Autonomous Vehicles, Self-Driving Cars, Machine Ethics, The Trolley Problem, Anthropocentrism, Deontological Ethics, Virtue Theory, Utilitarianism


Tegmark (2017, pg. 98–99) notes that car accidents claimed over 1.2 million lives globally in 2015. Israeli computer scientist Moshe Vardi became very emotional at a 2016 meeting of the Association for the Advancement of Artificial Intelligence, stating that not only could AI reduce the number of deaths, but it must, and declaring it a moral imperative. Almost all car crashes are caused by human error, and it is widely believed that up to 90% of road deaths could be prevented with the use of self-driving cars. This is one of the major incentives for getting self-driving cars on the road. The purpose of this essay is to determine an optimal ethical framework for the control of self-driving cars.

The impact of autonomy on industry and manufacturing has been significant. In many applications machines have replaced humans in performing the “three D’s”: jobs that are dirty, dull and dangerous (Lin et al. 2016, pg. 4). Robots have been active in the field of transportation for some time, as autopilot controls and driverless trains show; however, their potential for colliding with other vehicles and humans has been minimal and the allowance for truly autonomous decision-making limited. With self-driving cars, however, the interaction is continuous and complicated, and the possibility of accidents ever-present. The car’s operating system must be designed to make many complex decisions. The public will judge these vehicles more harshly than human drivers, and the tolerance for error or accidents will be much lower. The vehicle must know what it “ought” to do in all cases, which requires a high level of reasoning. It requires a code of ethics. In this essay some of the challenges specific to instilling ethics in self-driving cars will be identified. Several different approaches to instilling ethics will be presented and evaluated with respect to the car. An optimal ‘hybrid’ approach will be explored, combining the rigid rule-based method of programming with the organic learning-based method. Finally, methods for testing and monitoring the technology will be outlined.

The Challenges of Instilling Ethics in Self-Driving Cars

Current autonomous systems, according to Allen and Wallach (2014), are not capable of full moral reasoning. They cannot completely understand their choices, choose their behaviour freely, or detect ethically relevant features in their environment. These elements must be combined to create autonomous moral agents. With this moral agency, robots would be capable of decision-making similar to that of humans. Getting to this stage will be very challenging, given the complicated and abstract nature of ethics.

An older piece of research that demonstrates the complexity of robotic ethics while providing a context analogous to the self-driving car is the Trolley Problem. The Trolley Problem consists of a series of hypothetical scenarios developed by British philosopher Philippa Foot (1967 pg. 1395–1415). Each scenario presents an extreme situation that tests the subject’s ethical standing. The dilemma has been elaborated upon by several subsequent authors, including Judith Jarvis Thomson (1985), who divides it into various responses dependent on the ethical theories of Utilitarianism, Deontology, Divine Command Theory, Ethical Relativism, and Virtue Ethics. These theories will be introduced again as possible guidelines that can be programmed into self-driving cars in a top-down, rules-based approach. The Trolley Problem serves to demonstrate how difficult it is to program a self-driving car to do what it ‘ought’ to do.

Patrick Lin (2016) highlights discrimination as a major obstacle for programmers in the field of self-driving cars who might use a Trolley Problem-type algorithm. For example, a hypothetical scenario is raised in which the car has the option of hitting either an 80-year-old grandmother or an 8-year-old girl (pg. 69). For many, this would seem an obvious choice, as the 8-year-old presumably has more of her life in front of her. However, the Institute of Electrical and Electronics Engineers (IEEE) prohibits discrimination against any persons based on “race, religion, gender, disability, age, national origin, sexual orientation, gender identity, or gender expression” (Lin 2016, pg. 70). The IEEE has over 430,000 members and publishes a number of scholarly journals. The organization is considered a major authority on the subject of autonomy. Deliberately programming in a choice that discriminates in any respect, such as the one above, would contravene its tenets. This is one reason why using the Trolley Problem as a template for instilling ethics into self-driving cars is problematic.

Noah Goodall (2016) weighs the value of using the Trolley Problem to define and determine ethics for self-driving cars. Its strengths are that it clearly illustrates the areas of greatest ethical importance and serves as an ‘edge case’, an extreme situation in which vehicle response can be judged. The main weakness is over-simplification; the moral problems that an autonomous vehicle might face will be far more subtle and complex than the scenarios outlined in the famous thought experiment. Computers interpret instructions literally. An automated vehicle that valued every human life equally might give more room on the road to a motorcyclist without a helmet than to one wearing full protective equipment, because the unprotected rider would be less likely to survive an accident. The result is that the safety-conscious rider is put in greater danger. Another generalization is that of valuing pedestrian safety as categorically more important than that of any other road user; this approach can actually be much more dangerous in certain situations.

One final challenge to inculcating ethics, highlighted by Mladenović et al. (2014), is the claim that letting any non-human authority make life-and-death decisions is a violation of fundamental human rights as recognized by the United Nations. They conclude, though, that there is scope for developing control methods for an autonomous car as long as the design has a strong anthropocentric focus; that is, one that places human life above all else. As explained above, there are many challenges to establishing ethics in self-driving cars, and these must be taken into consideration when devising approaches to programming and training the vehicles’ operating systems.

Approaches to Developing Ethics in Self-Driving Cars

Currently there are two broad ways of programming ethical behaviour into a computer. The traditional method is a top-down, rules-based approach involving specific guidelines and/or laws. The more contemporary model is a bottom-up machine learning method, which exposes the computer to a myriad of situations that allow it to optimize its behaviour. This seeks to imitate learning, developmental and evolutionary processes (Lin et al. 2016, pg. 59).

Tegmark (2017) summarizes the major philosophical theories of ethics which could provide top-down-type guidelines. The oldest, derived by Aristotle, focused on virtues. In Virtue Theory, humans strive to be principled and of high moral character. Next, Immanuel Kant’s theory of Kantianism is a form of Deontological Ethics (from the Greek deon, “duty”) which stresses duties and obeying rules under all circumstances. Finally, Utilitarians emphasize the greatest good for the greatest number. The only rule is to make the future as good as possible, or to search for the rules that produce the best future results. To date, no realistic guiding set of rules has ever been determined, and therefore the top-down imposition of these three theories is impractical for programming self-driving cars.

Anderson and Anderson (2010) present an alternative theory devised by English philosophers Jeremy Bentham and John Stuart Mill. They contended that ethical decision-making is a matter of performing “moral arithmetic.” Their theory of Hedonistic Act Utilitarianism states that the right action is the one likely to result in the greatest “net pleasure,” calculated by adding up units of pleasure and subtracting units of displeasure experienced by all those affected. The theory is not all-inclusive, but at least it demonstrates that a plausible ethical theory is, technically, computable (pg. 74).
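The “moral arithmetic” described above can be pictured with a minimal Python sketch. It is an illustration only: the Effect class, the function names, the likelihood weighting and every number are assumptions made here, not part of Anderson and Anderson’s text, and no real system is claimed to score pleasure this way.

```python
from dataclasses import dataclass


@dataclass
class Effect:
    """How one affected party fares under an action (hypothetical units)."""
    person: str
    pleasure: float          # units of pleasure the action would bring
    displeasure: float       # units of displeasure the action would bring
    likelihood: float = 1.0  # assumed chance that this effect occurs


def net_pleasure(effects):
    """Add up likelihood-weighted pleasure and subtract displeasure."""
    return sum(e.likelihood * (e.pleasure - e.displeasure) for e in effects)


def right_action(options):
    """Return the name of the action with the greatest net pleasure."""
    return max(options, key=lambda name: net_pleasure(options[name]))


# Invented numbers, purely to show the arithmetic.
options = {
    "brake": [Effect("passenger", 0, 2), Effect("pedestrian", 5, 0)],
    "swerve": [Effect("passenger", 0, 4), Effect("pedestrian", 5, 0)],
}
print(right_action(options))
```

With these invented figures, braking scores 3 and swerving scores 1, so the sketch selects braking. The hard part in practice is not the sum but deciding what the units are and who assigns them.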

Murphy, R., & Woods, D. D. (2009). Beyond Asimov: The Three Laws of Responsible Robotics. IEEE Intelligent Systems, 24(4), 19. doi:10.1109/mis.2009.69

Robotic laws have been formulated since at least 1942, when science fiction writer Isaac Asimov introduced his Three Laws of Robotics. The laws, which he published in a number of short stories and novels, are still respected by the robot ethics research community today and serve as a template for further study. Murphy and Woods (2009) made the modifications to Asimov’s laws outlined in Table 1. It is conceivable that laws of this kind can be incorporated in the top-down programming of self-driving cars.

Gerdes and Thornton (2016, pg. 95) also reference Asimov’s famous rules of robotics, and although they concede that these are not a fitting solution for modern robot autonomy, the rules nonetheless raise the important point of anthropocentric design. The authors (pg. 87) suggest that the algorithms that control autonomous vehicles will be evaluated not on test results or statistics, but on the standards and ethics of the society in which they operate. They seek to identify a path by which the ethical dilemmas raised by Lin (2016) and others can be converted into a blueprint for controlling self-driving cars, and claim that it is possible to relate these concepts to mathematical rules that a self-driving car’s computer can follow. Their method was to take a number of rules from deontological (rules-based) and consequentialist (outcome-based) ethics and express them in the language of control theory, the engineering discipline concerned with steering a system’s behaviour subject to costs and constraints. This is a more elaborate top-down approach.
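A minimal Python sketch can show how rules-based and outcome-based ethics might both enter a single decision. It assumes that deontological rules act as hard constraints and that consequentialist concerns act as a cost to be minimised; that split, the function names, the fields and the numbers are all assumptions of this illustration, not a description of Gerdes and Thornton’s implementation.

```python
def choose_manoeuvre(candidates, rules, cost):
    """Pick the lowest-cost candidate among those that break no rule."""
    permitted = [c for c in candidates if all(rule(c) for rule in rules)]
    if not permitted:
        # No candidate satisfies every rule; this sketch simply falls back
        # to minimising cost over all candidates.
        permitted = candidates
    return min(permitted, key=cost)


def avoids_pedestrian(candidate):
    """Hypothetical rule: never pick a path predicted to hit a pedestrian."""
    return not candidate["hits_pedestrian"]


def cost(candidate):
    """Hypothetical cost: deviation from the planned path plus discomfort."""
    return candidate["path_deviation_m"] + candidate["passenger_discomfort"]


# Invented candidate manoeuvres for a single decision step.
candidates = [
    {"name": "stay", "hits_pedestrian": True,
     "path_deviation_m": 0.0, "passenger_discomfort": 0.0},
    {"name": "swerve_left", "hits_pedestrian": False,
     "path_deviation_m": 1.5, "passenger_discomfort": 0.5},
    {"name": "brake_hard", "hits_pedestrian": False,
     "path_deviation_m": 0.0, "passenger_discomfort": 1.0},
]

best = choose_manoeuvre(candidates, [avoids_pedestrian], cost)
print(best["name"])
```

Staying on course is the cheapest option but is ruled out by the constraint, so the sketch chooses hard braking over swerving. What happens when no option satisfies every rule is a design decision in its own right; the fallback used here is only one possibility.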

Ron Arkin (2009) created an ‘Ethical Adaptor’ model to emulate moral emotions and allow robots to learn from their mistakes. The researchers were able to model ‘guilt’ based on psychological theories of behaviour. The experience of guilt caused machines in a military context to alter their behaviour. This manipulation of emotion is reminiscent of Aristotle’s Virtue Theory. This approach would be inappropriate for a self-driving car, however, as something must go wrong before the robot adjusts its behaviour.

Parkin (2017) identifies an organization called GoodAI which seeks to train artificial intelligence in the field of ethics using a bottom-up approach involving machine learning, in contrast to the top-down rules-based approaches described above. Researchers and engineers program useful skills (problem-solving heuristics) into a basic artificial intelligence. The AI has the potential to learn new skills, use them to improve its learning ability, and then apply this knowledge to new situations. A cognitive scientist at NYU and head of a company called Geometric Intelligence agrees with this strategy, observing that top-down programming does not allow for subtle distinctions and abstractions, or changes in beliefs over time. GoodAI’s approach mirrors the way humans learn. Humans learn what is ethically acceptable by watching how others behave, and GoodAI likewise introduces increasingly complex decisions to its computers over time. The computers can build on previously learned knowledge and get feedback from human and digital monitors. In a similar vein, AI systems at the Entertainment Intelligence Lab are read thousands of stories, which allows them to develop an averaged-out response to different situations. Bottom-up machine learning is a more useful way of instructing the operating system of a self-driving car, although one drawback of this approach is the potential to adopt bad behaviours as well as good.

An Optimal Approach to Developing Ethics in Self-Driving Cars

The robots used for self-driving cars will require rules and characteristics that are top-down programmable and bottom-up machine learnable. Of the top-down approaches, Utilitarianism and Virtue Theory are the most feasible, with Virtue Theory appearing to be the best. As Patrick Lin et al. (2016) write: “Virtues constitute a hybrid between top-down and bottom-up approaches, in that the virtues themselves can be explicitly described but their acquisition as moral character traits seems essentially to be a bottom-up process” (pg. 59). Anderson and Anderson (2010) point out that determining ethical boundaries for behaviour in cases such as the self-driving car is an easier task than trying to devise universal rules of ethical and unethical behavior, which is what the ethical theories described above attempt to do. When given a description of a particular situation within which the robots are likely to function, most ethicists would agree on what is ethically permissible and what is not, such as crashing (pg. 75). The authors used machine learning on a representative number of cases in which humans have determined certain decisions to be ethically correct, based on ratings such as how much good an action would result in, how much harm it would prevent, and a measure of fairness. The AI would then abstract a general principle that can be applied to new situations.

One possible source of specific car-crash machine learning data could be an on-line platform such as the MIT Moral Machine (Bonnefon et al. 2016), in which a multitude of crash scenarios are generated and information is gathered on the decisions people make between two destructive outcomes. The scenarios are a variation on the Trolley Problem. Information from the exercise could be collected and employed to help find solutions to critical self-driving car problems. Real-life self-driving car crashes can also provide useful data. Google claims that all of its self-driving car accidents so far have been due to human error, mostly human drivers not paying enough attention to the road (Parry, 2015). Google’s self-driving cars have covered more than two million miles during tests across the US. There have been only 16 reported accidents, none fatal. Goodall (2016, pg. 814) highlights a specific Google-patented form of risk management which defines a risk as “the magnitude of misfortune associated with the feared event multiplied by its likelihood.” Each potential outcome is given a likelihood as well as a positive or negative magnitude (either a benefit or a cost). Each event’s magnitude is multiplied by its likelihood, and the resulting numbers are summed. If the benefits outweigh the costs by a reasonable amount, the vehicle will undertake the action being considered. This algorithm is a practical crash avoidance strategy that can be programmed into self-driving cars.
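The risk calculation described above (multiply each outcome’s magnitude by its likelihood, sum the results, and act only if the total clears some margin) can be written out as a short Python sketch. The function names, the margin parameter and the lane-change numbers are hypothetical and chosen only to make the arithmetic visible; they are not taken from the patent or from Goodall.

```python
def expected_value(outcomes):
    """Sum likelihood * magnitude over all outcomes.

    Each outcome is a (likelihood, magnitude) pair; a positive magnitude
    is a benefit and a negative magnitude is a cost.
    """
    return sum(likelihood * magnitude for likelihood, magnitude in outcomes)


def should_undertake(outcomes, margin):
    """Act only if benefits outweigh costs by more than `margin`.

    `margin` stands in for the "reasonable amount" in the text; its value
    is an assumption of this sketch.
    """
    return expected_value(outcomes) > margin


# Invented figures for a lane change under consideration.
lane_change = [
    (0.95, 1.0),    # likely: a clearer view of the road ahead (benefit)
    (0.05, -10.0),  # unlikely: contact with a vehicle alongside (cost)
]

print(expected_value(lane_change))
print(should_undertake(lane_change, margin=0.2))
```

With these figures the expected value is roughly 0.45, so the manoeuvre goes ahead with a margin of 0.2 but would be rejected with a margin of 0.5. The choice of margin is therefore itself an ethical setting rather than a purely technical one.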

Noah Goodall (2014) also examines unavoidable self-driving car crashes and proposes a third step to the hybridization. The stages are as follows: create a universal, rationalized system involving deontology and/or consequentialism for self-driving cars; utilize machine learning to study human reactions to real and simulated crashes; and require the cars to communicate their decisions so that humans may understand the reasoning behind them (pg. 63). The cumulative data from the last step could be employed in the machine learning of other self-driving cars and their operating systems. This would allow for a widely configurable and easily understood rationale behind robot vehicle ethics. A comprehensive, hybrid approach as outlined above would be the most effective way to instill ethics into self-driving cars.

Testing and Monitoring

Roman Krzanowski and Kamil Trombik (2017) propose the idea of a standardized Machine Ethics Test, or MET, which would verify whether a machine is ethically positive or negative, and responsible for its actions. It does not, the authors clarify, declare a machine to be as morally upstanding as a human, but rather shows that the autonomous agent is safe to be around and that its intention to do harm is limited to as narrow a margin as humanly possible. It is based on the Turing Test, an early AI test of a machine’s ability to demonstrate intelligent behaviour equivalent to that of a human. The test has three stages: first, a qualifying verification of basic morality; then a series of situational tests split into controlled, open-ended, and staged; and, finally, a period of apprenticeship in which an autonomous agent would be strictly supervised by a human under real-life conditions.

Gogoll and Müller (2016) further study this idea of a Machine Ethics Test with respect to self-driving cars and the Trolley Problem. In particular, they explore whether there should be a Mandatory Ethics Setting (MES) for all autonomous cars, or whether each driver should have their own customizable Personal Ethics Setting (PES). Each has its pros and cons. The PES, for example, would allow an elderly couple to declare themselves willing to be sacrificed in the event of a ‘Tunnel Problem’. The Tunnel Problem (Millar, 2014) is a variation on the Trolley Problem in which a self-driving car is approaching a tunnel when a child runs out onto the road. A choice must be made to actively kill either the child or the driver. In another scenario, a father and provider for his family might choose never to sacrifice himself. The authors raise the same issues as Patrick Lin (2016) about the dangers of discrimination. There is also the problem of putting too much moral pressure on the driver; a PES might lead to a prisoner’s dilemma in which the individual chooses not the option for the greater good, but rather the one that serves their own self-interest. The authors conclude that an MES would be preferable, as it would eliminate the prisoner’s dilemma.

Several organizations have been formed to monitor the development of artificial intelligence. The Future of Life Institute (FLI) is a volunteer-run research and outreach organization that works to mitigate existential risks facing humanity, particularly existential risk from advanced artificial intelligence. Its founders include MIT cosmologist Max Tegmark and Skype co-founder Jaan Tallinn, and its board of advisors includes cosmologist Stephen Hawking and entrepreneur Elon Musk. The Responsible Robotics Group’s mission is to shape the future of robotics design, development, use, regulation and implementation. The Global Initiative on Ethical Autonomous Systems, sponsored by the IEEE, describes itself as “An incubation space for new standards and solutions, certifications and codes of conduct, and consensus building for ethical implementation of intelligent technologies” (“The IEEE Global Initiative on Ethics”, n.d.). Its aim is to ensure that every stakeholder involved in the design and development of autonomous and intelligent systems is educated, trained, and empowered to prioritize ethical considerations so that these technologies are advanced for the benefit of humanity. These organizations and others can help ensure that self-driving car companies are programming their vehicles to minimize death, injury and damage to property.


Car accidents kill more than a million people every year, and self-driving cars have the ability to prevent up to 90% of these deaths. The difficulty lies in programming the operating systems with ethics to govern their behaviour in the accidents which will inevitably occur. The goal of this paper is to determine an optimal strategy for controlling the behaviour of autonomous cars to ensure safety and fair decision-making. This is done by examining the challenges of encoding ethics in autonomous systems. The Trolley Problem is introduced and criticisms of its use as an ethics template are presented. A number of approaches to actually programming ethics into computers are discussed. They range from top-down rules-based programming to bottom-up machine-learning strategies, to an optimal strategy which combines elements of both and possibly additional factors. Virtue Theory is a useful starting point, as virtues can be explicitly described. The machine-learning piece would involve training on cases which demonstrate ethically correct behaviour. Exposure to data from car crash simulation software, or even from actual crashes, could also augment machine learning. Feedback from drivers and artificial intelligence monitors could further optimize decision-making.

Anderson and Anderson (2010) point out that, intriguingly, machine ethics could end up influencing the study of ethics as a whole. The ‘real world’ perspective of machine learning of morally appropriate conduct could get closer to capturing what counts as ethical behavior in people than does the abstract theorizing of academic ethicists. Properly trained machines might even behave more ethically than many human beings would, because they would be capable of making calculated, impartial decisions, something inherently self-interested humans might not do. The fear of self-driving cars might be misplaced, because they may turn out to be safer in many respects than human drivers. The authors speculate that interacting with an ethical robot might someday even inspire us to behave more ethically ourselves. Similar sentiments are expressed by Mark Riedl of the Entertainment Intelligence Lab (Parkin, 2017): “We are never going to have a perfect self-driving car. It’s going to have accidents. But it is going to have fewer accidents than a human. So… our goal should be no worse than humans. Just maybe, it could be possible to be better than humans” (pg. 69).
