How to improve cybersecurity for artificial intelligence

U.S. Department of Homeland Security election security workers monitor screens in the DHS National Cybersecurity and Communications Integration Center (NCCIC) in Arlington, Virginia, U.S. November 6, 2018. REUTERS/Jonathan Ernst
Editor's note:

This report from The Brookings Institution’s Artificial Intelligence and Emerging Technology (AIET) Initiative is part of “AI Governance,” a series that identifies key governance and norm issues related to AI and proposes policy remedies to address the complex challenges associated with emerging technologies.

In January 2017, a group of artificial intelligence researchers gathered at the Asilomar Conference Grounds in California and developed 23 principles for artificial intelligence, which were later dubbed the Asilomar AI Principles. The sixth principle states that “AI systems should be safe and secure throughout their operational lifetime, and verifiably so where applicable and feasible.” Thousands of people in both academia and the private sector have since signed on to these principles, but, more than three years after the Asilomar conference, many questions remain about what it means to make AI systems safe and secure. The endeavor is further complicated by the difficulty of verifying these features in a rapidly developing field with highly complicated deployments in health care, financial trading, transportation, and translation, among others.

Much of the discussion to date has centered on how beneficial machine learning algorithms may be for identifying and defending against computer-based vulnerabilities and threats by automating the detection of and response to attempted attacks. Conversely, concerns have been raised that using AI for offensive purposes may make cyberattacks increasingly difficult to block or defend against by enabling rapid adaptation of malware to adjust to restrictions imposed by countermeasures and security controls. These are also the contexts in which many policymakers most often think about the security impacts of AI. For instance, a 2020 report on “Artificial Intelligence and UK National Security” commissioned by the U.K.’s Government Communications Headquarters highlighted the need for the United Kingdom to incorporate AI into its cyber defenses to “proactively detect and mitigate threats” that “require a speed of response far greater than human decision-making allows.”

A related but distinct set of issues deals with the question of how AI systems can themselves be secured, not just how they can be used to augment the security of our data and computer networks. The push to implement AI security solutions to respond to rapidly evolving threats makes the need to secure AI itself even more pressing; if we rely on machine learning algorithms to detect and respond to cyberattacks, it is all the more important that those algorithms be protected from interference, compromise, or misuse. Increasing dependence on AI for critical functions and services will create not only greater incentives for attackers to target those algorithms, but also the potential for each successful attack to have more severe consequences.

This policy brief explores the key issues in attempting to improve cybersecurity and safety for artificial intelligence as well as roles for policymakers in helping address these challenges. Congress has already indicated its interest in cybersecurity legislation targeting certain types of technology, including the Internet of Things and voting systems. As AI becomes a more important and widely used technology across many sectors, policymakers will find it increasingly necessary to consider the intersection of cybersecurity with AI. In this paper, I describe some of the issues that arise in this area, including the compromise of AI decision-making systems for malicious purposes, the potential for adversaries to access confidential AI training data or models, and policy proposals aimed at addressing these concerns.

Securing AI decision-making systems

One of the major security risks to AI systems is the potential for adversaries to compromise the integrity of their decision-making processes so that they do not make choices in the manner that their designers would expect or desire. One way to achieve this would be for adversaries to directly take control of an AI system so that they can decide what outputs the system generates and what decisions it makes. Alternatively, an attacker might try to influence those decisions more subtly and indirectly by delivering malicious inputs or training data to an AI model.

For instance, an adversary who wants to compromise an autonomous vehicle so that it will be more likely to get into an accident might exploit vulnerabilities in the car’s software to make driving decisions themselves. However, remotely accessing and exploiting the software operating a vehicle could prove difficult, so instead an adversary might try to make the car ignore stop signs in a given area by defacing them with graffiti, so that the car’s computer vision algorithm can no longer recognize them as stop signs. This practice of causing AI systems to make mistakes by manipulating their inputs is known as adversarial machine learning. Researchers have found that small changes to digital images that are undetectable to the human eye can be sufficient to cause AI algorithms to completely misclassify those images.
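The mechanics of an adversarial input can be sketched with a toy example. The Python snippet below is a minimal illustration, not a real vision system: it assumes a hypothetical linear "stop sign detector" with made-up weights and four made-up input features, and nudges each feature by a bounded amount in the direction that most lowers the detector's score. In real image classifiers the per-pixel change can be far smaller, because the effect accumulates across many thousands of pixels.

```python
# Toy linear "stop sign detector": label 1 means "stop sign", 0 means "other".
# WEIGHTS, BIAS, and the feature values are made-up illustrative numbers.
WEIGHTS = [0.5, -0.25, 0.75, -0.5]
BIAS = -0.125


def score(x):
    """Weighted sum of the input features plus the bias."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def predict(x):
    """Return 1 ("stop sign") when the score is positive, else 0."""
    return 1 if score(x) > 0 else 0


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def adversarial_input(x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is simply WEIGHTS, so stepping against sign(WEIGHTS) is the most
    damaging change under a per-feature budget of epsilon.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(WEIGHTS, x)]


clean = [0.5, 0.5, 0.5, 0.25]
perturbed = adversarial_input(clean, epsilon=0.25)
largest_change = max(abs(a - b) for a, b in zip(clean, perturbed))

# Prediction on the clean input, on the perturbed input, and the largest
# change made to any single feature.
print(predict(clean), predict(perturbed), largest_change)
```

In this sketch the detector labels the clean input a stop sign and the perturbed input something else, even though no feature moved by more than the chosen budget.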

An alternative approach to manipulating inputs is data poisoning, which occurs when adversaries cause an AI model to be trained on inaccurate, mislabeled data. For example, an adversary might supply pictures of stop signs that are labeled as something else, so that the algorithm will not recognize stop signs when it encounters them on the road. This data poisoning can then lead an AI algorithm to make mistakes and misclassifications later on, even if the adversary cannot directly manipulate the inputs it receives. Even just selectively training an AI model on a subset of correctly labeled data may be sufficient to compromise a model so that it makes inaccurate or unexpected decisions.
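A small sketch can make the stop sign example concrete. The snippet below is an illustration under simplifying assumptions: each sign is reduced to two made-up numeric features, the labels "stop" and "yield" are placeholders, and the model is a basic nearest-neighbor vote rather than the deep networks used in practice. The adversary never touches the input at prediction time; they only add two mislabeled examples to the training set.

```python
import math
from collections import Counter


def classify(training_set, x, k=3):
    """Label x by majority vote among its k nearest training examples."""
    nearest = sorted(training_set, key=lambda example: math.dist(example[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up feature vectors: stop signs cluster near (1, 1), yield signs near (5, 5).
clean_training_set = [
    ((1.0, 1.0), "stop"), ((1.2, 0.8), "stop"), ((0.8, 1.1), "stop"),
    ((5.0, 5.0), "yield"), ((5.2, 4.9), "yield"), ((4.8, 5.1), "yield"),
]

# The adversary injects stop-sign-like examples that carry the wrong label.
poisoned_training_set = clean_training_set + [
    ((1.0, 0.95), "yield"), ((1.05, 0.9), "yield"),
]

new_stop_sign = (1.0, 0.9)
print(classify(clean_training_set, new_stop_sign))
print(classify(poisoned_training_set, new_stop_sign))
```

The same unmodified input is labeled correctly by the model trained on clean data and incorrectly by the model trained on the poisoned set.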

These risks speak to the need for careful control over both the training datasets that are used to build AI models and the inputs that those models are then provided with to ensure the security of machine-learning-enabled decision-making processes. However, neither of those goals is straightforward. Inputs to machine learning systems, in particular, are often beyond the control of AI developers, who cannot determine, for instance, whether there will be graffiti on the street signs that computer vision systems in autonomous vehicles encounter. On the other hand, developers have typically had much greater control over training datasets for their models. But in many cases, those datasets may contain very personal or sensitive information, raising yet another set of concerns about how that information can best be protected and anonymized. These concerns can often create trade-offs for developers about how that training is done and how much direct access to the training data they themselves have.

Research on adversarial machine learning has shown that making AI models more robust to data poisoning and adversarial inputs often involves building models that reveal more information about the individual data points used to train those models. When sensitive data are used to train these models, this creates a new set of security risks, namely that adversaries will be able to access the training data or infer training data points from the model itself. Trying to secure AI models from this type of inference attack can leave them more susceptible to the adversarial machine learning tactics described above and vice versa. This means that part of maintaining security for artificial intelligence is navigating the trade-offs between these two different, but related, sets of risks.
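To illustrate the inference risk on its own (not the trade-off with robustness), the sketch below shows one simple form of membership inference. It rests on loud assumptions: the "model" is a hypothetical toy that memorizes its training points and reports higher confidence the closer an input is to one of them, and the 0.9 threshold is an arbitrary choice. The attacker can only query the model, yet can still guess which candidate records were used in training.

```python
import math


def make_model(training_points):
    """Return a query function for a toy model that memorizes its training data.

    The confidence is 1.0 for an input identical to a training point and
    falls off with the distance to the nearest training point.
    """
    def query(x):
        nearest = min(math.dist(x, p) for p in training_points)
        return 1.0 / (1.0 + nearest)
    return query


def infer_membership(query, candidates, threshold=0.9):
    """Guess that a candidate was in the training set when confidence is high."""
    return [query(c) >= threshold for c in candidates]


# Made-up "sensitive" records, represented as two numeric features each.
private_training_points = [(1.0, 2.0), (3.0, 1.0), (4.0, 4.0)]
model = make_model(private_training_points)

# The attacker only calls model(...); it never reads private_training_points.
candidates = [(1.0, 2.0), (3.0, 1.0), (2.0, 2.0), (6.0, 6.0)]
guesses = infer_membership(model, candidates)
print(guesses)
```

Here the first two candidates, which were in the training set, are flagged as members, and the last two are not. Real models leak membership far less cleanly, but the underlying signal (unusually confident behavior on training data) is the same kind of cue.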

Policy proposals for AI security

In the past four years there has been a rapid acceleration of government interest and policy proposals regarding artificial intelligence and security, with 27 governments publishing official AI plans or initiatives by 2019. However, many of these strategies focus more on countries’ plans to fund more AI research activity, train more workers in this field, and encourage economic growth and innovation through development of AI technologies than they do on maintaining security for AI. Countries that have proposed or implemented security-focused policies for AI have emphasized the importance of transparency, testing, and accountability for algorithms and their developers—although few have gotten to the point of actually operationalizing these policies or figuring out how they would work in practice.

In the United States, the National Security Commission on Artificial Intelligence (NSCAI) has highlighted the importance of building trustworthy AI systems that can be audited through a rigorous, standardized system of documentation. To that end, the commission has recommended the development of an extensive design documentation process and standards for AI models, including what data is used by the model, what the model’s parameters and weights are, how models are trained and tested, and what results they produce. These transparency recommendations speak to some of the security risks around AI technology, but the commission has not yet extended them to explain how this documentation would be used for accountability or auditing purposes. At the local government level, the New York City Council established an Automated Decision Systems Task Force in 2017 that stressed the importance of security for AI systems; however, the task force provided few concrete recommendations beyond noting that it “grappled with finding the right balance between emphasizing opportunities to share information publicly about City tools, systems, and processes, while ensuring that any relevant legal, security, and privacy risks were accounted for.”

A 2018 report by a French parliamentary mission, titled “For a Meaningful Artificial Intelligence: Towards a French and European Strategy,” offered similarly vague suggestions. It highlighted several potential security threats raised by AI, including manipulation of input data or training data, but concluded only that there was a need for greater “collective awareness” and more consideration of safety and security risks starting in the design phase of AI systems. It further called on the government to seek the “support of specialist actors, who are able to propose solutions thanks to their experience and expertise” and advised that the French Agence nationale de la sécurité des systèmes d’information (ANSSI) should be responsible for monitoring and assessing the security and safety of AI systems. In a similar vein, China’s 2017 New Generation AI Development Plan proposed developing security and safety certifications for AI technologies as well as accountability mechanisms and disciplinary measures for their creators, but the plan offered few details as to how these systems might work.

For many governments, the next stage of considering AI security will require figuring out how to implement ideas of transparency, auditing, and accountability to effectively address the risks of insecure AI decision processes and model data leakage.

Transparency will require the development of a more comprehensive documentation process for AI systems, along the lines of the proposals put forth by the NSCAI. Rigorous documentation of how models are developed and tested and what results they produce will enable experts to identify vulnerabilities in the technology, potential manipulations of input data or training data, and unexpected outputs.
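One way to picture such documentation is as a structured record that travels with each model. The sketch below is purely illustrative: the `ModelRecord` fields, the example values, and the choice of a SHA-256 fingerprint of the weights are my own assumptions, not a format proposed by the NSCAI. It covers the kinds of items the commission lists (data used, parameters and weights, how the model was trained and tested, and the results it produced) and shows how an auditor might later check that the deployed weights are the documented ones.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class ModelRecord:
    """Hypothetical documentation record for one trained model."""
    model_name: str
    training_data_sources: list
    hyperparameters: dict
    training_procedure: str
    test_results: dict
    weights_sha256: str


def fingerprint_weights(weights):
    """Hash the weights so later audits can detect any change to them."""
    payload = json.dumps(weights, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def weights_match_record(record, weights):
    """True if the given weights are the ones the record was written for."""
    return record.weights_sha256 == fingerprint_weights(weights)


# All values below are made-up placeholders for illustration only.
weights = {"layer1": [0.5, -0.25], "bias": [0.125]}
record = ModelRecord(
    model_name="sign-classifier-demo",
    training_data_sources=["example-street-sign-images-v1"],
    hyperparameters={"learning_rate": 0.01, "epochs": 10},
    training_procedure="Supervised training on labeled images; 80/20 train/test split.",
    test_results={"accuracy": 0.97},
    weights_sha256=fingerprint_weights(weights),
)

# The record serializes to JSON so it can be stored or shared with auditors.
documentation = json.dumps(asdict(record), indent=2)

tampered = {"layer1": [0.5, -0.3], "bias": [0.125]}
print(weights_match_record(record, weights), weights_match_record(record, tampered))
```

A record like this does not make a model secure by itself, but it gives auditors a fixed reference point against which changes to data, parameters, or results can be detected.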

Thorough documentation of AI systems will also enable governments to develop effective testing and auditing techniques as well as meaningful certification programs that provide clear guidance to AI developers and users. These audits would, ideally, leverage research on adversarial machine learning and model data leakage to test AI models for vulnerabilities and assess their overall robustness and resilience to different forms of attacks through an AI-focused form of red teaming. Given the dominance of the private sector in developing AI, it is likely that many of these auditing and certification activities will be left to private businesses to carry out. But policymakers could still play a central role in encouraging the development of this market by funding research and standards development in this area and by requiring certifications for their own procurement and use of AI systems.
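As a rough sketch of what one automated check in such an audit might look like, the snippet below measures "robust accuracy": the share of labeled test examples that a model still classifies correctly when each input feature may be shifted by up to a fixed budget. Everything here is an assumption for illustration, including the toy two-feature model, the audit set, and the budget values. Checking only the corners of the perturbation range is exhaustive for a linear model like this one, but it would not be sufficient for more complex models and is feasible only for very small inputs.

```python
from itertools import product


def predict(x):
    """Toy model under audit: label 1 when the two features sum to more than 1."""
    return 1 if x[0] + x[1] > 1.0 else 0


def is_robust(model, x, label, epsilon):
    """True if the model returns label at every corner of the epsilon box around x."""
    for offsets in product((-epsilon, epsilon), repeat=len(x)):
        shifted = [xi + d for xi, d in zip(x, offsets)]
        if model(shifted) != label:
            return False
    return True


def robust_accuracy(model, examples, epsilon):
    """Fraction of labeled examples that stay correctly classified under attack."""
    robust = sum(is_robust(model, x, y, epsilon) for x, y in examples)
    return robust / len(examples)


# Made-up audit set of (features, correct label) pairs.
audit_set = [([1.0, 1.0], 1), ([0.75, 0.5], 1), ([0.25, 0.25], 0), ([0.5, 0.25], 0)]

for epsilon in (0.0, 0.25):
    print(epsilon, robust_accuracy(predict, audit_set, epsilon))
```

In this toy audit the model is fully accurate with no perturbation but keeps only half of its correct answers once each feature can move by 0.25, which is the kind of gap a certification standard could require developers to report.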

Finally, policymakers will play a vital role in determining accountability mechanisms and liability regimes to govern AI when security incidents occur. This will involve establishing baseline requirements for what AI developers must do to show they have carried out their due diligence with regard to security and safety, such as obtaining recommended certifications or submitting to rigorous auditing and testing standards. Developers who do not meet these standards and build AI systems that are compromised through data poisoning or adversarial inputs, or that leak sensitive training data, would be liable for the damage caused by their technologies. This will serve both as an incentive for companies to comply with policies related to AI auditing and certification and as a means of clarifying who is responsible when AI systems cause serious harm due to a lack of appropriate security measures and what the appropriate penalties are in those circumstances.

The proliferation of AI systems in critical sectors—including transportation, health, law enforcement, and military technology—makes clear just how important it is for policymakers to take seriously the security of these systems. This will require governments to look beyond just the economic promise and national security potential of automated decision-making systems to understand how those systems themselves can best be secured through a combination of transparency guidelines, certification and auditing standards, and accountability measures.

The Brookings Institution is a nonprofit organization devoted to independent research and policy solutions. Its mission is to conduct high-quality, independent research and, based on that research, to provide innovative, practical recommendations for policymakers and the public. The conclusions and recommendations of any Brookings publication are solely those of its author(s), and do not reflect the views of the Institution, its management, or its other scholars.

Microsoft provides support to The Brookings Institution’s Artificial Intelligence and Emerging Technology (AIET) Initiative. The findings, interpretations, and conclusions in this report are not influenced by any donation. Brookings recognizes that the value it provides is in its absolute commitment to quality, independence, and impact. Activities supported by its donors reflect this commitment.


  • Footnotes
    1. Roman Yampolskiy ed., Artificial Intelligence Safety and Security, Chapman and Hall/CRC: 2018.
    2. Roman Yampolskiy, “AI Is the Future of Cybersecurity, for Better and for Worse,” Harvard Business Review, May 8, 2017.
    3. Alexander Babuta, Marion Oswald and Ardi Janjeva, “Artificial Intelligence and UK National Security: Policy Considerations,” Royal United Services Institute for Defence and Security Studies, April 2020.
    4. Ion Stoica, Dawn Song, Raluca Ada Popa, David A. Patterson, Michael W. Mahoney, Randy H. Katz, Anthony D. Joseph, Michael Jordan, Joseph M. Hellerstein, Joseph Gonzalez, Ken Goldberg, Ali Ghodsi, David E. Culler and Pieter Abbeel, “A Berkeley View of Systems Challenges for AI,” University of California, Berkeley, Technical Report No. UCB/EECS-2017-159. October 16, 2017.
    5. Anh Nguyen, Jason Yosinski, and Jeff Clune, “Deep neural networks are easily fooled: High confidence predictions for unrecognizable images,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 427–436.
    6. Arjun Nitin Bhagoji, Supriyo Chakraborty, Prateek Mittal, and Seraphin Calo, “Analyzing Federated Learning through an Adversarial Lens,” Proceedings of the 36th International Conference on Machine Learning, ICML 2019 (pp. 1012-1021).
    7. Reza Shokri, Marco Stronati, Congzheng Song and Vitaly Shmatikov, “Membership Inference Attacks Against Machine Learning Models,” 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, 2017, pp. 3-18, doi: 10.1109/SP.2017.41.
    8. Liwei Song, Reza Shokri, and Prateek Mittal, “Privacy Risks of Securing Machine Learning Models against Adversarial Examples,” Proceedings of the ACM SIGSAC Conference, 2019, 241-257. 10.1145/3319535.3354211.
    9. Jessica Cussins Newman, “Toward AI Security: Global Aspirations for a More Resilient Future,” Berkeley Center for Long-Term Cybersecurity, February 2019.
    10. National Security Commission on Artificial Intelligence, “First Quarter Recommendations,” March 2020.
    11. “New York City Automated Decision Systems Task Force Report,” November 2019.