How to Use Big Data to Combat Ebola

The economists Erik Brynjolfsson and Andrew McAfee have compared the fine-grained insights possible from computational research – Big Data – to the transformation in knowledge that occurred when Antonie van Leeuwenhoek first peered at samples through his new invention, the microscope.

In light of this comparison, it is fitting that perhaps nowhere is the potential benefit of big data so compelling as in health care. As the President’s Council of Advisors on Science & Technology put it in the White House Big Data inquiry, “millions of health records (including analog data such as scans in addition to digital data), vast amounts of genomic information, extensive data on successful and unsuccessful clinical trials, hospital records, and so forth” will enable researchers to understand disease at new levels of detail. Because health information is so sensitive, health care also is an area where realizing the benefits of big data requires getting privacy right.

The Ebola outbreak in West Africa has put a spotlight on the potential for mobile phone records to help in understanding how the disease spreads. African countries are not data-rich environments but, for every 100 inhabitants, there are 89 mobile phones. The metadata in call records maintained by mobile phone operators – who called whom, at what time, and from where – offers a rich source of data that can be used to track, among other things, importation routes for infectious diseases, patterns of migration, or economic transactions. But efforts to share this data in the fight against Ebola have run into roadblocks.

A Brookings paper I wrote with MIT Ph.D. candidate Yves-Alexandre de Montjoye and Gates Foundation economist Jake Kendall looks at some of the obstacles. Our work was part of a working group of the Big Data@MIT initiative looking at various Big Data use cases to consider the privacy issues involved and how technology and other tools could address them.

We examined two scenarios of mobile phone data for development that are quite distinct from a regulatory and privacy perspective. One, modeled on previous research, involved the use of location metadata to track the spread of infectious diseases (e.g. malaria or Ebola) within and among countries. The second case considered the use of mobile phone data to define subgroups based on specific traits and behaviors, and then micro-target outreach for interventions. We also considered limited circumstances where the data might be used to select specific individuals to be identified and contacted directly.
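
The first scenario can be illustrated with a minimal sketch. It assumes the call records have already been reduced to (subscriber, timestamp, region) tuples; the function name `region_flows`, the record layout, and the `min_users` threshold are hypothetical and are not taken from our paper. The idea is to count how many distinct subscribers are seen in one region and next in another, and to drop any flow observed for too few people.

```python
from collections import defaultdict


def region_flows(records, min_users=2):
    """Count distinct subscribers moving between regions.

    records: iterable of (subscriber_id, timestamp, region) tuples
             (a hypothetical, simplified view of call record metadata).
    min_users: flows with fewer distinct subscribers are suppressed,
               an illustrative safeguard rather than a guarantee.
    """
    # Group each subscriber's sightings so they can be ordered in time.
    by_user = defaultdict(list)
    for subscriber, timestamp, region in records:
        by_user[subscriber].append((timestamp, region))

    # For every consecutive pair of sightings in different regions,
    # remember which subscribers made that move.
    travellers = defaultdict(set)
    for subscriber, events in by_user.items():
        events.sort()
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:
                travellers[(origin, destination)].add(subscriber)

    # Release only aggregate counts, and only for flows that are large enough.
    return {
        pair: len(users)
        for pair, users in travellers.items()
        if len(users) >= min_users
    }
```

Only aggregate counts leave the function, which is the kind of output an epidemiological model of importation routes would need. The threshold reduces, but does not eliminate, the chance that a released flow points to a single person.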

These mobile phone data case studies revealed ways in which, despite the promise, regulatory barriers and privacy challenges are preventing the use of mobile phone metadata from realizing its full potential. More specifically, our analysis showed (1) the lack of commonly accepted practices for sharing mobile phone data in privacy-conscientious ways and (2) an uncertain and country-specific regulatory landscape for data sharing, especially across borders.

Our article makes several recommendations to facilitate the use of mobile phone metadata for humanitarian purposes like fighting Ebola in ways that will protect against the misuse of information:

  1. There is a clear need for companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models in different development use cases – a wider and higher-level discussion of the kind our MIT working group conducted. Such best practices would help carriers and policymakers strike the right balance between privacy and utility in the use of metadata, and make it easier and less risky for carriers to support humanitarian and research uses, and for researchers and NGOs to use metadata appropriately.
  2. Such best practices should accept that there are no perfect ways to de-identify data – and probably never will be. There will always be some risk that must be balanced against the public good that can be achieved. While much more research is needed in computational privacy, widespread adoption of existing techniques as standards could enable this trend of sharing data in a privacy-conscientious way.
  3. Standards and practices as well as legal regulation also need to address and incorporate trust mechanisms for humanitarian sharing of data in a more nuanced way. The recognition of trusted third-parties and systems to manage datasets, enable detailed audits, and control the use of data could enable greater sharing of more useful data among multiple parties while providing a barrier against risks.
  4. There is a need for governments to focus on adopting laws and rules that simplify the collection and use of mobile phone metadata for research and public good purposes. Governments should also seek to harmonize laws on the sharing of metadata with common identifiers across national borders. Clear and consistent rules will help, but only if they take a pragmatic and privacy-conscientious approach to anonymization, cross-border transfers, and novel uses that enable public good uses of data and allow for public health emergencies and other valuable research.
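
As one illustration of the "existing techniques" mentioned in the second recommendation, phone numbers in call records can be replaced with keyed hashes before the data leaves the operator. The sketch below uses Python's standard `hmac` and `hashlib` modules and assumes a secret key held by the operator or a trusted third party; the function name `pseudonymize` and the way the key is handled are hypothetical. In keeping with that recommendation, this is not perfect de-identification: a person's pattern of locations can still single them out even when the phone number is hidden.

```python
import hashlib
import hmac


def pseudonymize(phone_number: str, secret_key: bytes) -> str:
    """Replace a phone number with a keyed hash (HMAC-SHA256).

    The same number and key always give the same pseudonym, so records
    can still be linked per subscriber, but the number cannot be
    recovered or recomputed by anyone who lacks the key.
    """
    digest = hmac.new(secret_key, phone_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

Using a key, rather than a plain hash, matters because the space of valid phone numbers is small enough to enumerate. Keeping that key with a trusted party is one way the trust mechanisms in the third recommendation could come into play.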

Research based on mobile phone data, computational privacy, and data protection rules all may seem secondary when confronted by the challenges of poverty, disease, and basic economic growth. But they are on the critical path to realizing the great potential of information technology to help address these critical problems.