The tensions between explainable AI and good public policy

A group of British students protest the use of algorithms to predict their grades on standardized exams.

Democratic governments and agencies around the world are increasingly relying on artificial intelligence. Police departments in the United States, United Kingdom, and elsewhere have begun to use facial recognition technology to identify potential suspects. Judges and courts have started to rely on machine learning to guide sentencing decisions. In the U.K., one in three local authorities is said to be using algorithms or machine learning (ML) tools to make decisions about issues such as welfare benefit claims. These government uses of AI are widespread enough to prompt the question: Is this the age of government by algorithm?

Many critics have expressed concerns about the rapidly expanding use of automated decision-making in sensitive areas of policy such as criminal justice and welfare. The most often voiced concern is the issue of bias: When machine learning systems are trained on biased data sets, they will inevitably embed in their models the data’s underlying social inequalities. The data science and AI communities are now highly sensitive to data bias issues, and as a result have started to focus far more intensely on the ethics of AI. Similarly, individual governments and international organizations have published statements of principle intended to govern AI use.

A common principle of AI ethics is explainability. The risk of producing AI that reinforces societal biases has prompted calls for greater transparency about algorithmic or machine learning decision processes, and for ways to understand and audit how an AI agent arrives at its decisions or classifications. As the use of AI systems proliferates, being able to explain how a given model or system works will be vital, especially for those used by governments or public sector agencies.

Yet explainability alone will not be a panacea. Although transparency about decision-making processes is essential to democracy, it is a mistake to think this represents an easy solution to the dilemmas algorithmic decision-making will present to our societies.

There are two reasons why. First, with machine learning in general and neural networks or deep learning in particular, there is often a trade-off between performance and explainability. The larger and more complex a model, the harder it will be to understand, even though its performance is generally better. Unfortunately, for complex situations with many interacting influences—which is true of many key areas of policy—machine learning will often be more useful the more of a black box it is. As a result, holding such systems accountable will almost always be a matter of post hoc monitoring and evaluation. If it turns out that a given machine learning algorithm’s decisions are significantly biased, for example, then something about the system or (more likely) the data it is trained on needs to change. Yet even post hoc auditing is easier said than done. In practice, there is surprisingly little systematic monitoring of policy outcomes at all, even though there is no shortage of guidance about how to do it.

The second reason stems from an even more significant challenge. The aim of many policies is often not made explicit, typically because the policy emerged as a compromise between people pursuing different goals. These necessary compromises in public policy present a challenge when algorithms are tasked with implementing policy decisions. A compromise in public policy is not always a bad thing; it allows decision makers to resolve conflicts and to avoid hard questions about the exact outcomes desired. Yet this is a major problem for algorithms, as they need clear goals to function. An emphasis on greater model explainability will never be able to resolve this challenge.

Consider the recent use of an algorithm to produce U.K. high school grades in the absence of examinations during the pandemic, which provides a remarkable example of just how badly algorithms can function in the absence of well-defined goals. British teachers had submitted their assessment of individual pupils’ likely grades and ranked their pupils within each subject and class. The algorithm significantly downgraded many thousands of these assessed results, particularly in state schools in low-income areas. Star pupils with conditional university places consequently failed to attain the level they needed, causing much heartbreak, not to mention pandemonium in the centralized system for allocating students to universities.

After a few days of uproar, the U.K. government abandoned the results, instead awarding everyone the grades their teachers had predicted. When the algorithm was finally published, it turned out to have placed most weight on matching the distribution of grades the same school had received in previous years, penalizing the best pupils at typically poorly performing schools. However, small classes were omitted as having too few observations, which meant affluent private schools with small class sizes escaped the downgrading.
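The mechanism described above can be sketched in a few lines. This is a deliberately simplified illustration, not the published model: the function name `standardize`, the `small_class_cutoff` value, and the example grade shares are all assumptions made for the sketch. It shows how fitting a class to its school's historical grade distribution overrides the teacher's prediction for strong pupils at weaker schools, while a small class passes through untouched.

```python
def standardize(teacher_grades, historical_share, small_class_cutoff=5):
    """Assign grades by rank so the class matches the school's past distribution.

    teacher_grades: teacher-predicted grades, ordered best-ranked pupil first.
    historical_share: (grade, fraction) pairs from best grade to worst,
        with fractions summing to 1 (hypothetical figures in this sketch).
    small_class_cutoff: hypothetical threshold below which the class is
        considered too small to adjust and teacher grades are kept.
    """
    n = len(teacher_grades)
    if n < small_class_cutoff:
        return list(teacher_grades)  # small classes escape the adjustment

    awarded = []
    cumulative = 0.0
    for grade, share in historical_share:
        cumulative += share
        # Number of pupils who should hold this grade or better, minus
        # those already placed in higher grades.
        quota = round(cumulative * n) - len(awarded)
        awarded.extend([grade] * quota)

    # Guard against rounding leaving a pupil unplaced.
    while len(awarded) < n:
        awarded.append(historical_share[-1][0])
    return awarded[:n]


# A school where, historically, 10% of pupils got an A, 30% a B, 60% a C.
history = [("A", 0.1), ("B", 0.3), ("C", 0.6)]

large_class = ["A", "A", "A", "B", "B", "B", "B", "C", "C", "C"]
small_class = ["A", "A", "B"]

adjusted_large = standardize(large_class, history)
adjusted_small = standardize(small_class, history)
```

In this toy case only one pupil in the large class keeps an A, whatever the teacher predicted for the second and third ranked pupils, while every pupil in the small class keeps the predicted grade. Nothing in the code is opaque; the unfairness comes from the objective it was given.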

Of course, the policy intention was never to increase educational inequality, but to prevent grade inflation. This aim had not been stated publicly beforehand—or statisticians might have warned of the unintended consequences. The objectives of no grade inflation, school by school, and of individual fairness were fundamentally in conflict. Injustice to some pupils—those who had worked hardest to overcome unfavorable circumstances—was inevitable.

For government agencies and offices that increasingly rely on AI, the core problem is that machine learning algorithms need to be given a precisely specified objective. Yet in the messy world of human decision-making and politics, it is often possible and even desirable to avoid spelling out conflicting aims. By balancing competing interests, compromise is essential to the healthy functioning of democracies.

This is true even in the case of what might at first glance seem a more straightforward example, such as keeping criminals who are likely to reoffend behind bars rather than granting them bail or parole. An algorithm using past data to find patterns will, given the historically higher likelihood that people from low-income or minority communities will have been arrested or imprisoned, predict that similar people are more likely to offend in future. Perhaps judges can stay alert for this data bias and override the algorithm when sentencing particular individuals.

But there is still an ambiguity about what would count as a good outcome. Take bail decisions. About a third of the U.S. prison population is awaiting trial. Judges make decisions every day about who will await trial in jail and who will be bailed, but an algorithm can make a far more accurate prediction than a human about who will commit an offense if they are bailed. According to one model, if bail decisions were made by algorithm, the prison population in the United States would be 40% smaller, with the same recidivism rate as when the decisions are made by humans. Such a system would reduce prison populations, an apparent improvement on current levels of mass incarceration. But given that people of color make up the great majority of the U.S. prison population, the algorithm may also recommend that a higher proportion of people from minority groups be denied bail, which seems to perpetuate unfairness.

Some scholars have argued that exposing such trade-offs is a good thing. Algorithms or ML systems can then be set more specific aims—for instance, to predict recidivism subject to a rule requiring that equal proportions of different groups get bail—and still do better than humans. What’s more, this would enforce transparency about the ultimate objectives.
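To see what such a rule involves, consider a minimal sketch that assumes each defendant already has a predicted risk score. The function names, the `risk` and `group` fields, and the example figures are all hypothetical; no real bail model is being reproduced. The first function releases the lowest-risk defendants overall. The second applies the same release rate within each group, which is one possible reading of "equal proportions."

```python
from collections import defaultdict


def release_unconstrained(defendants, release_rate):
    """Release the lowest-risk share of defendants, ignoring group."""
    ranked = sorted(defendants, key=lambda d: d["risk"])
    k = int(len(ranked) * release_rate)
    return {d["id"] for d in ranked[:k]}


def release_equal_rates(defendants, release_rate):
    """Release the lowest-risk share of defendants within each group."""
    by_group = defaultdict(list)
    for d in defendants:
        by_group[d["group"]].append(d)

    released = set()
    for members in by_group.values():
        members.sort(key=lambda d: d["risk"])
        k = int(len(members) * release_rate)
        released |= {d["id"] for d in members[:k]}
    return released


# Hypothetical data: group Y's scores are higher across the board, as they
# would be if the model had learned from historically skewed arrest records.
defendants = [
    {"id": 1, "group": "X", "risk": 0.1},
    {"id": 2, "group": "X", "risk": 0.2},
    {"id": 3, "group": "X", "risk": 0.3},
    {"id": 4, "group": "X", "risk": 0.4},
    {"id": 5, "group": "Y", "risk": 0.5},
    {"id": 6, "group": "Y", "risk": 0.6},
    {"id": 7, "group": "Y", "risk": 0.7},
    {"id": 8, "group": "Y", "risk": 0.8},
]

unconstrained = release_unconstrained(defendants, 0.5)
equal_rates = release_equal_rates(defendants, 0.5)
```

With these figures the unconstrained rule bails only members of group X, while the constrained rule bails two people from each group. Either version is easy to write. Choosing between them is the decision the code cannot make.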

But this is not a technical problem about how to write computer code. Perhaps greater transparency about objectives could eventually be healthy for our democracies, but it would certainly be uncomfortable. Compromises work by politely ignoring inconvenient contradictions. Should government assistance for businesses hit by the pandemic go to those with most employees or to those most likely to repay? There is no need to answer this question about ultimate aims in order to set specific criteria for an emergency loan scheme. But to automate the decision requires specifying an objective—save jobs, maximize repayments, or perhaps weight each equally. Similarly, people might disagree about whether the aim of the justice system is retribution or rehabilitation and yet agree on sentencing guidelines.
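The loan-scheme example can be made concrete with a small sketch. Everything here is illustrative: the function `rank_applicants`, the `weight_jobs` parameter, and the three applicants are invented, and a real scheme would use far richer criteria. The sketch shows only that an automated ranking cannot run until someone chooses a number for the weight, which is exactly the question a human compromise can leave unanswered.

```python
def rank_applicants(applicants, weight_jobs):
    """Rank loan applicants by a weighted mix of two objectives.

    weight_jobs: hypothetical policy weight in [0, 1].
        1.0 means "only jobs saved matters";
        0.0 means "only likelihood of repayment matters".
    """
    max_jobs = max(a["employees"] for a in applicants)

    def score(a):
        jobs_term = a["employees"] / max_jobs  # scaled to [0, 1]
        return weight_jobs * jobs_term + (1 - weight_jobs) * a["repay_prob"]

    ranked = sorted(applicants, key=score, reverse=True)
    return [a["name"] for a in ranked]


# Invented applicants for illustration.
applicants = [
    {"name": "large_risky", "employees": 200, "repay_prob": 0.30},
    {"name": "medium", "employees": 80, "repay_prob": 0.70},
    {"name": "small_safe", "employees": 10, "repay_prob": 0.95},
]

jobs_first = rank_applicants(applicants, weight_jobs=1.0)
repayment_first = rank_applicants(applicants, weight_jobs=0.0)
```

The two extreme weights reverse the order of the queue, and every value in between encodes a political judgment about whose interests come first.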

Dilemmas about objectives do not crop up in many areas of automated decisions or predictions, where the interests of those affected and those running the algorithm are aligned. Both the bank and its customers want to prevent fraud; both the doctor and her patient want an accurate diagnosis or radiology result. However, in most areas of public policy there are multiple overlapping and sometimes competing interests.

There is often a trust deficit too, particularly in criminal justice and policing, or in welfare policies which bring the power of the state into people’s family lives. Even many law-abiding citizens in some communities do not trust the police and judiciary to have their best interests at heart. It is naïve to believe that algorithmically enforced transparency about objectives will resolve political conflicts in situations like these. The first step, before deploying machines to make decisions, is not to insist on algorithmic explainability and transparency, but to restore the trustworthiness of institutions themselves. Algorithmic decision-making can sometimes assist good government but can never make up for its absence.

Diane Coyle is professor of public policy and co-director of the Bennett Institute at the University of Cambridge.