Formal methods as a path toward better cybersecurity

Carstens_Unsplash_Cyber Coding

Five years ago, cybersecurity researchers accomplished a rare feat. A team at the Pentagon’s far-out research arm, the Defense Advanced Research Projects Agency (DARPA), loaded special software into a helicopter’s flight control computer. Then they invited expert hackers to break into the software. After repeated attempts, the flight control system stood strong against all attempts to gain unauthorized control. 

This outcome was unusual. Experienced hackers who are given direct, privileged access to software almost always find a way in. The reason is simple. Decades after the birth of computer programming, modern software products are riddled with flaws, many of which create security vulnerabilities that attackers can easily exploit to slip through digital defenses. This is why reducing the error rate in software code is essential to turn the tide against relentless computer criminals and foreign adversaries that steal wealth and menace critical infrastructure with relative impunity. 

How was DARPA’s custom flight control software able to shrug off its assailants? The researchers turned to formal methods, a frequently overlooked group of technologies that programmers can use to create ultra-secure, ultra-reliable software. DARPA’s experiment is one of several examples that underscore the potential for formal methods to remake software security. They herald a not-too-distant future when radically safer, more secure software can allow us to embrace other emerging technologies without catastrophic consequences. 

What are formal methods?

Before it is ready for primetime, any piece of software should be able to satisfy at least two criteria:

  1. Under normal conditions, the software provides the desired features (e.g., you can edit, save, and share files); and
  2. When errors do occur, the software handles them gracefully (i.e., the program does not crash your whole computer system, leak private data, or give control to someone else).

Because most customers use software as it is intended, software programmers devote most of their attention to satisfying the first criteria: ensuring the software works properly under normal conditions. This is relatively easy to evaluate through user feedback, as customers tend to be vocal when a piece of software obviously misbehaves or omits an advertised feature.

The second dimension is much trickier—and the bane of the cybersecurity community. Virtually all software code contains defects that can cause the software to fail in some way. Humans write software, and our species is naturally prone to mistakes. Larger and more complex software applications multiply opportunities for committing and overlooking errors by orders of magnitude. Human minds excel at creating highly capable software (the first criteria), but they are ill-equipped to identify and eliminate software defects. One defect might be harmless, while another might crash the entire program. Others can lead to security failures. These happen when human attackers purposefully exploit defects to cause a specific kind of software failure that achieves their own objectives, such as leaking private data or giving control to them, the attacker.

The software industry has coalesced around two methods for reducing the error rate in software code. The first is simply education and knowledge exchange. Experts collaborate on a global scale to share information about software vulnerabilities and how to fix them. Yet as computing has matured, this body of knowledge has become overwhelming. It is extremely challenging to know when to apply lessons learned and how to verify implementation. Another common method to improve software quality is intensive testing. Yet this can consume massive resources, and most testing only indicates the presence of defects—it cannot prove the absence of them. Given the context of cybersecurity, where attackers actively hunt for any possible flaw that designers overlooked, these two methods have proved insufficient in solving software security.

Formal methods encompass a group of technologies that aim to manage these problems much more effectively by supplementing human resources with computational resources. In the 1970s, when it became clear that computers would become the foundation of military power, the defense community realized it needed greater assurance that its software was of the highest quality and free of security issues. Early programmers knew they could not rely on human judgment alone to ferret out all possible security vulnerabilities. They needed ways to prove that critical pieces of software would not crash by accident or contain unknown security flaws that attackers could exploit. They wanted to maximize confidence that a specific software application would do only what its authorized users intended, and nothing else. 

What is the best way to prove an objective truth? Math and logic. The pioneers of formal methods adapted mathematical logic to construct abstract representations of software (“I want this program to do X”) and then use advanced mathematical theorems to prove that the software code they wrote would only accomplish X.

The term formal methods evolved over time, and today it represents a spectrum of sophistication, from relatively simple instructions for planning a software project to automated programs that function like a super spell-check for code. The method of analysis varies between different formal methods, but it is largely automated and ideally carried out with mathematical precision. Lightweight formal methods are already in widespread use today. Static type theory, for example, has become a standard feature in several programming languages. These methods require no specialized knowledge to use and provide low-cost protection against common software faults.

More sophisticated formal methods can prevent more complex problems. Formal verification is one such example and it enables programmers to prove that their software does not contain certain errors and behaves exactly according to specification. These more advanced methods tend to require specialized knowledge to apply, but with recent advances this bar has been coming down. (For a more detailed description of different types of formal methods, curious readers should read pages 6-9 of this report by the National Institute of Standards and Technology.)

Problems with formal methods and recent innovation

Like the neural networks that revolutionized artificial intelligence, formal methods are a technology undergoing a renaissance after spending decades in the shadows. As software became more complicated, applying the more advanced tools for proving code—the ones that could provide the highest assurance that security vulnerabilities were absent—became exponentially more difficult. As the National Institute of Standards and Technology explains, formal methods “developed a reputation as taking far too long, in machine time, person years and project time, and requiring a PhD in computer science and mathematics to use them.” For a long time, formal methods were relegated to mission-critical use cases, such as nuclear weaponry or automotive systems, where designers were willing to devote immense time and resources to creating error free software. But research into formal methods continued, led by a dedicated corps of experts in academia, federal research institutions, and a handful of specialized companies. 

More recent developments, including DARPA’s helicopter project, suggest formal methods are poised to remake how we design software and transform cybersecurity. In November 2016, the National Institute for Standards and Technology delivered a comprehensive report to the White House recommending alternative ways to achieve a “dramatic” reduction in software vulnerabilities. Devoting six pages to formal methods, the report noted that “formal methods have become mainstream in many behind-the-scenes applications and show significant promise for both building better software and for supporting better testing.” 

Leading technology companies have quietly rolled out formal methods in their core businesses. Amazon Web Services (AWS), arguably one of the most important infrastructure providers on the planet, has an entire team that uses formal methods to create “provable security” for its customers. Facebook has shown how formal verification techniques can be integrated into a “move fast and break things” approach with its INFER system, which continuously verifies the code in every update for its mobile applications. Microsoft has also stood up its own dedicated team on formal verification. As one team member explained last year, “Proving theorems about programs has been a dream of computer science for the last 60 years or more, and we’re finally able to do this at the scale required for an important, widely deployed security-critical piece of software.” And it is not just Big Tech. Specialty companies like Galois, Synopsys, and MathWorks are creating a more competitive market for sophisticated formal methods solutions that companies of various sizes can put to work.

Looking forward, the National Science Foundation’s ongoing DeepSpec Expedition in Computing has demonstrated the applicability of these methods to increasingly complex engineering tasks, including the development of entire operating systems (which tend to be much larger than single applications), database engines, memory managers, and other essential computing and software components. These successes represent a significant step forward for the field, which has long sought to find reliable, low-cost/low-time methods for engineering such components.

These clear signs of progress notwithstanding, the most sophisticated types of formal methods—such as full-blown formal verification—are still a long way from becoming a go-to tool for the average software developer. The organizations listed above are not representative, of course, and challenges still remain to bring formal methods to the rest of the software industry. We need an ecosystem of tools, more training for working engineers, and more consensus on when to deploy which methods. We also need to begin changing the way software standards committees publish their work; instead of prose, they should begin publishing formal models that allow the application of formal methods. Lastly, we need to begin educating technology decisionmakers about these capabilities and their ramifications.

There are at least two reasons why industry and government should seize on ongoing innovations in the field and accelerate adoption.

First, unlike many cybersecurity measures, proper application of formal methods does not only drive costs up. Since formal methods reduce overall defect count in software, systems built with formal methods can require less maintenance and thus be cheaper to operate than today’s ad-hoc alternatives. Additionally, further improvements in automation are expected to provide these benefits without adding significant cost to the initial engineering efforts. Whereas as most security measures drive costs and hurt profit margins, proper use of formal methods can help defeat attackers while improving the bottom line. Even where software is too complicated to use formal verification—the most robust weapon in the formal methods arsenal—much more basic formal methods can still lower software lifecycle costs simply by enforcing more rigorous development practices that some software developers know, but don’t use. 

Second, the steady drumbeat for software liability may soon change the cost calculus for software developers who have traditionally not born all the costs of unreliable, flawed software. The final report issued by Congress’ Cyberspace Solarium Commission recommended that Congress should pass a law establishing liability for “final goods assemblers” of software that contains “known and unpatched” vulnerabilities. Some types of formal methods offer clear opportunities to establish more objective standards of care for determining such liability. 

Just one bug in one line of code

Today, we have a global software industry that frequently creates software in an ad-hoc manner, churning out products without truly knowing what is in them, how they might fail, and what will happen if they do. The situation was tolerable when software did not run the world but computing now either controls or informs nearly every aspect of the economy, politics, and social life. And because the individual components that make up a larger software program are interdependent, even a single error in any phase of the manual development process—design, implementation, testing, evaluation, operation, maintenance—can be catastrophic. One bug in one line of code can create a security vulnerability that spans millions of computer systems, enabling data theft and digital disruption on a massive scale. 

Formal methods are not the ultimate answer to cybersecurity. Even their most sophisticated manifestation, formal verification, cannot guarantee perfect security. Neither can the world’s best engineers guarantee that a skyscraper will not collapse. But through rigorous standards, objective testing, and the scientific method, they have achieved an outstanding record. By injecting similar rigor into the software industry, formal methods can, at the very least, give us much higher assurance that digital technology will behave. 

Tim Carstens is an adviser at the Cyber Independent Testing Lab.
David Forscey is the managing director of the Aspen Cybersecurity Group. He previously worked in the Center for Best Practices at the National Governors Association and Third Way.

Amazon, Facebook, and Microsoft provide financial support to the Brookings Institution, a nonprofit organization devoted to rigorous, independent, in-depth public policy research.