This report from The Brookings Institution’s Artificial Intelligence and Emerging Technology (AIET) Initiative is part of “AI Governance,” a series that identifies key governance and norm issues related to AI and proposes policy remedies to address the complex challenges associated with emerging technologies.
Affective computing is an interdisciplinary field that uses algorithms to analyze bodies, faces, and voices to infer human emotion and state of mind. Although not widely in use, law enforcement agencies and companies are experimenting with using affective computing to extract personality information, detect deception, and identify criminal behavior. Yet, there is insufficient evidence that these technologies work reliably enough to be used for the high stakes of law enforcement. Even worse, they threaten core American principles of civil liberty in a pluralistic society by presuming that facial movements, physical reactions, and tone of voice can be evidence of criminality. The Biden administration should publicly and unequivocally reject this perspective by banning the use of affective computing in federal law enforcement.
Affective computing includes a broad set of technologies that use data and algorithms to recognize and influence human emotions. Many questions remain unanswered, but there are plausibly valuable applications of affective computing that warrant further research. For instance, audio recordings might help identify veterans who may be at risk of committing suicide. Driver monitoring systems using affective computing could feasibly warn exhausted drivers and reduce traffic fatalities. The Woebot app has attracted attention for offering automated therapy using affective computing—although the efficacy of this approach is unknown. There can be honest debate over the efficacy of these applications, and over the extent to which they have been rushed into commercial use ahead of the science. Yet the tasks that interest law enforcement—such as lie detection and identifying criminal behavior—are clearly beyond the capacity of affective computing.
Available evidence suggests that affective computing is not effective enough to be used in law enforcement. In an evaluation by the Canada Border Services Agency, an experimental automated interviewing system called AVATAR performed dismally as a lie detector. Despite taking over one million measurements in each interview—including eye tracking, facial movements, and vocal features—AVATAR was unable to reliably identify deceit. The evaluation even gave the affective computing system an absurd handicap by using the same data for both model development and evaluation, a practice that typically leads to overly optimistic results. Psychologists are quick to note that there is no scientific basis for treating body language, facial expressions, and vocal pitch as indicators of deception. Journalistic investigations of similar commercial interviewing software found that the systems could be confused by interviewees wearing glasses or by bookshelves added to the background. Another commercial system failed to recognize that an interviewee was not speaking English while attempting to rate their English competency. A broader review of the scientific literature suggests that inferring emotion from facial expressions is unreliable.
Despite the high stakes and lack of evidence of efficacy, there is reason to be concerned that law enforcement will implement affective computing. The Guardian has reported that “dozens of local police departments in the US” use EyeDetect, which dubiously purports to detect deception with eye tracking. In the UK, a county police force is experimenting with a system that claims to identify “anger and distress” in addition to performing facial recognition. At the federal level, the Department of Homeland Security and other agencies funded and tested the AVATAR system, although it seems both the U.S. and Canada wisely decided against using it. The EU spent around five million euros on research for an immigration security system called iBorderCtrl, which also attempted lie detection using facial features and movements. While some local law enforcement agencies have been less hesitant, federal law enforcement agencies appear to have largely refrained from deploying affective computing systems, despite some testing and experimentation.
Without a ban, there will always be an ongoing possibility that some agencies or departments will overestimate the power of affective computing and inappropriately apply the technology. Recent evidence from the use of facial recognition should raise alarms. Just last month, a report from the Government Accountability Office (GAO) found that thirteen federal law enforcement agencies do not track or provide guidance on the use of non-governmental facial recognition systems; only one agency does. What’s more, reporting from Buzzfeed suggests the GAO review missed five additional law enforcement agencies that claimed not to use the most notorious facial recognition software, Clearview AI, despite appearing in Clearview’s data. Facial recognition does work, but its widely documented limitations should have led law enforcement to catalog its use and develop best practices in advance of its widespread application—this did not happen.
That affective computing software is easy to access also enables its potential adoption. The Google, Amazon, and Microsoft cloud services all claim to enable some form of emotion detection from facial images, while IBM sells a tone analyzer. Some signs, such as the $74.5 million acquisition of Affectiva, suggest the affective computing industry is growing rapidly, which means its tools will become easier to find and use. Some companies already claim they can identify behavioral patterns that indicate criminal intent—an incredibly dubious assertion—and are explicitly advocating for the use of these systems by law enforcement. These companies are aided by the spread of facial recognition systems, which establish video cameras and data systems that can accommodate affective computing software. There is also a tremendous amount of research around affective computing, which creates an opportunity for companies to spin narratives with selective evidence for their specific applications.
A wider ban of affective computing for high-stakes decisions, such as in hiring and college admissions, would be delayed by the need for congressional approval. A complete ban in all federal agencies might also undermine the potential for its valuable use. However, there is nothing preventing President Biden from issuing an executive order that bans the use of affective computing in federal law enforcement agencies.
There are quite a few advantages to a targeted federal law enforcement ban. Foremost, this would preempt the use of an unproven technology by federal officers, which would be harmful even if it were rare. From everything we know about AI, it is also safe to infer this harm would fall especially on minorities and people with disabilities. Beyond federal agencies, a ban would also send a signal to state and local law enforcement that the federal government believes the evidence does not justify using affective computing. It may also deter the would-be Clearview AIs of the world—companies that might take drastic steps to develop and sell affective computing software while downplaying potential public harms. Furthermore, the use of affective computing by law enforcement hinges on the fundamental premise that something expressive that a person does with their body, face, or voice is evidence of wrong-doing or even criminality. This would be prototypically dystopian, if not for its present use in China, and it certainly has no place in a country committed to civil liberty. Less tangibly, but no less valuably, this ban would be a clear signal—both to its own citizens and to the rest of the world—that the United States stands only for the safe and ethical use of AI.
Public policy in the United States is loath to implement the precautionary principle, in which a technology is banned before its harm is widely demonstrated, but the argument in this case is clear. Since affective computing is concerned with emotional and personality analysis, this ban would not prevent the use of speech-to-text transcription, weapon detection, or facial recognition.
In the future, if independent research demonstrates the validity and efficacy of a specific affective technology, an exception could easily be made. Until then, by banning affective computing in federal law enforcement, President Biden has an opportunity to set a positive example for the United States and the world about the ethical use of high-stakes artificial intelligence technologies.
The Brookings Institution is a nonprofit organization devoted to independent research and policy solutions. Its mission is to conduct high-quality, independent research and, based on that research, to provide innovative, practical recommendations for policymakers and the public. The conclusions and recommendations of any Brookings publication are solely those of its author(s), and do not reflect the views of the Institution, its management, or its other scholars.
Microsoft provides support to The Brookings Institution’s Artificial Intelligence and Emerging Technology (AIET) Initiative, and Amazon, Google, and IBM provide general, unrestricted support to the Institution. The findings, interpretations, and conclusions in this report are not influenced by any donation. Brookings recognizes that the value it provides is in its absolute commitment to quality, independence, and impact. Activities supported by its donors reflect this commitment.