Commentary

Don’t let perfect be the enemy of good: To leverage the data revolution we must accept imperfection

Last month, we experienced yet another breakthrough in the epic battle of man against machine: Google’s AlphaGo defeated Go champion Lee Sedol. This success, however, was different from that of IBM’s Deep Blue against Garry Kasparov in 1997. While Deep Blue still applied “brute force” to calculate all possible options ahead, AlphaGo was learning as the game progressed. And it is through this computing breakthrough that we can learn how to better leverage the data revolution.

In the game of Go, brute-force strategies don’t help because the total number of possible move combinations exceeds the number of atoms in the universe. Some games, including some we have played since childhood, were immune to computing “firepower” for a long time. Connect Four, for example, wasn’t solved until 1995, with the conclusion that the first player can force a win. And checkers wasn’t solved until 2007, when Jonathan Schaeffer determined that perfect play by both sides leads to a draw. Chess has yet to be solved, meaning we don’t know whether white can force a win or whether, as in checkers, black can hold on to a draw.
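
To put that claim in perspective, here is a back-of-the-envelope sketch in Python. The branching factors and game lengths are commonly cited rough estimates, not figures from this article; the point is simply that the number of possible games dwarfs the roughly 10^80 atoms in the observable universe, so enumerating them all is hopeless.

```python
import math

# Rough, commonly cited estimates (assumptions, not figures from the article):
# chess: ~35 legal moves per position over ~80 plies; Go: ~250 moves over ~150 plies.
GAMES = {"chess": (35, 80), "Go": (250, 150)}

ATOMS_IN_OBSERVABLE_UNIVERSE_LOG10 = 80  # order of magnitude only

def log10_move_sequences(branching_factor: float, plies: int) -> float:
    """log10 of the approximate number of possible move sequences."""
    return plies * math.log10(branching_factor)

for game, (b, plies) in GAMES.items():
    print(f"{game}: ~10^{log10_move_sequences(b, plies):.0f} possible games "
          f"vs. ~10^{ATOMS_IN_OBSERVABLE_UNIVERSE_LOG10} atoms in the observable universe")
```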

But most real-life situations are more complicated than chess, precisely because the universe of options is unlimited and solving them requires learning. If computers are to help beyond their use as glorified calculators, they need to be able to learn. This is the starting point of the artificial intelligence movement. In a world where perfection is impossible, you need well-informed intuition in order to advance. The first breakthrough in this space came when IBM’s Watson beat America’s Jeopardy! champions in 2011. These new intelligent machines operate on probabilities, not certainties.

That being said, perfection remains important, especially in matters of life and death such as flying airplanes, constructing houses, or performing heart surgery, where attention to detail is essential. At the same time, in many realms of life and policymaking we fall into a perfection trap. We often generate obsolete knowledge by trying to explain things perfectly when effective problem solving would be better served by real-time estimates. We strive for exactitude when rough results are, more often than not, good enough.

By contrast, some of today’s breakthroughs are built on approximation. Think of Google Translate, or of Google’s search engine itself. The results are often far from perfect, but compared to the alternative of not having them at all, or of spending hours leafing through an encyclopedia, they are wonderful. Moreover, once these imperfect breakthroughs are available, they can be improved iteratively. Only once the first Apple and IBM personal computers reached the market in the late 1970s and early 1980s did the cycle of upgrading begin, and it continues today.

In the realm of social and economic data, we have yet to reach this stage of “managed imperfection” and continuous upgrading. We still produce social and economic forecasts with solid 20th-century methods. We conduct poverty assessments and poverty maps with extreme care, usually taking at least a year to produce them because they involve hundreds of enumerators, lengthy interviews, and laborious data entry. Through these methods we can explain past events perfectly, but we fail to estimate current trends, even imperfectly.

The paradox of today’s big data era is that most of that data is poor and messy, even though the possibilities for improving it are unlimited. Almost every report from development institutions starts with a disclaimer highlighting “severe data limitations.” This is because only 0.5 percent of all the available data is actually curated and made usable. If data is the oil of the 21st century, we need data refineries to convert the raw product into something that can be consumed by the average person.

Thanks to the prevalence of mobile devices and rapid advances in satellite technology, it is possible to produce more data faster, better, and cheaper. High-frequency data also makes it possible to make big data personal, which increases the likelihood that people will act on it. Ultimately, breakthroughs in big data for development will be driven by managerial cultures, as has been the case with other successful ventures. Risk-averse cultures place great weight on perfection: they nurture the fear of mistakes and of losing. Modern management accepts failure, encourages trial and error, and achieves progress through iteration and continuous upgrading.