The crucial need to secure the location data of vulnerable populations

A U.S. Army soldier of the 5-20 Infantry Regiment, attached to the 82nd Airborne Division, aims his rifle in front of a bullet-riddled map of Afghanistan painted on a wall of an abandoned Canadian-built school in the Zharay district of Kandahar province, southern Afghanistan, June 9, 2012. REUTERS/Shamil Zhumatov

As the Taliban moved in on the Afghan capital of Kabul with unexpected speed earlier this year, foreign aid workers, diplomats, and military personnel attempted to destroy reams of sensitive data that they’d collected over the decades-long U.S. occupation. The information included photos of smiling Afghans shaking hands with U.S. colleagues and vast stores of biometric data, which could be used to precisely identify individuals. Highly detailed biometric databases built with U.S. funding and assistance had been used to pay police and military and, in the hands of the Taliban, threatened to become a potent weapon. Afghans who’d worked with foreign governments rushed to scrub their digital identities and hide evidence of their online actions, afraid of the Taliban using cheerful social media posts against them.

In just a few short days, the Taliban’s advance transformed these vast stores of data collected in the name of development and security from valuable asset to deadly liability. It’s a tragic tale, and from the perspective of experts on data security and privacy, it’s even more tragic because it was almost entirely predictable. For years, specialists have been sounding the alarm about the dangers of collecting and failing to secure data on the world’s most vulnerable. Despite an ever-growing list of ostensibly benevolent data collection efforts going wrong, like the recent revelation that the United Nations’ refugee agency shared non-consenting Rohingya refugees’ data with a government that has repeatedly tried to kill them, data advocates’ concerns are too often brushed off as paranoia or tiresome bureaucracy.

Now, thanks to events in Afghanistan, there is greater attention than ever on the drawbacks of data collection in humanitarian contexts. Much of the post-Taliban takeover reporting on data dangers has understandably focused on the potential misuses of the aforementioned stores of biometric data, which is used to identify people with very high reliability based on features like their fingerprints or their eyes. But amid the focus on biometric data, there has been far too little attention paid to another data-driven danger to vulnerable people: location.

The proliferation of location-data applications

In a sense, all data and privacy problems are ultimately linked to location. Biometric data is often linked to where a person exists in space, such as their hometown or birthplace, and is used to more accurately gauge where they’re supposed to be (and makes it easier to take notice when they travel elsewhere). Groups like the Taliban will struggle to take advantage of vast quantities of biometric data if they can’t locate the right people to use it on. If you want to crack down on the opposition or imprison your political opponents, you need to be able to find them first.

Just like biometric data collection, map-making and remote sensing efforts have exploded in the past decade across essentially every sector, and intelligence and aid are no exception. As Annie Jacobsen has reported, the U.S. military has worked closely with intelligence-collection company Palantir for years in Afghanistan, combining remote sensing data collected from sources like satellites, drones, and balloons with U.S. government biometric records. “Data-driven humanitarianism” has become a hot catchphrase in the humanitarian aid sector, and mapping efforts are part of that trend (consider this 2021 blog post on the World Food Programme’s efforts to create better maps of remote portions of Afghanistan). Crowd-sourced mapping, which refers to volunteers building open-source maps, has been a hot topic in aid ever since the 2010 Haiti earthquake, when volunteers with crowd-sourcing organization Ushahidi mapped out Twitter and Facebook posts visually on the platform, giving aid workers a better real-time sense of the disaster. Private industry has also leapt onboard. Since 2016, Facebook has been using machine-learning-powered tools to map the entire world: While its goal is to get more people online (and thus, profitably using Facebook), the company also releases the data to aid organizations and open-source mapping groups.

While these new mapping efforts are still largely written about in relatively chirpy, optimistic tones, the last decade has revealed more and more of the drawbacks of location-data collection. In 2018, open-source researchers discovered that the details of U.S. military members’ jogs in sensitive areas, including in Afghanistan, were being posted to the popular fitness-tracking app Strava. It was a major security hole that U.S. intelligence had failed to anticipate. Location data’s central problem usually boils down to this: People forget that if you put people and the locations they frequent on a map, especially if you post it on the internet, other people will probably be able to see it too.

Today, news stories about the darker uses of map tools created for ostensibly benevolent or neutral purposes pop up with depressing frequency. In June, for example, Google removed maps made with its custom-mapping tool that posted the names and home addresses of critics of the Thai monarchy, putting them in direct danger. In 2020, a marketing executive was falsely identified as the racist subject of a viral video. He’d been misidentified via his Strava records.

The COVID-19 pandemic turbocharged the use of location data in apps and tools aimed at identifying people who might have been exposed to the virus. The public-health results of these location-tracking apps largely appear to be underwhelming, but they have succeeded in highlighting the dangers posed by improperly secured location data. In the United States, location-tracking apps deployed in North and South Dakota leaked users’ location data to third-party apps, Google’s supposedly anonymous contact-tracing app contained an embarrassing privacy bug, and a Pennsylvania COVID-19 tracing app leaked more than 70,000 people’s personal information on the internet (among other incidents). Many privacy and security advocates are concerned that the normalization of these tracking apps will force or encourage growing numbers of people to use them, even when the pandemic becomes less of an active threat.

Location data in humanitarian contexts

As Afghanistan fell to the Taliban, a large network of volunteers outside the country rushed to help Afghan friends, colleagues, and family members escape. They quickly developed ad hoc methods of collecting information that could be used to help people. Often, this took the form of crowdsourced maps tracking the sites of reported bombings and Taliban roadblocks. The security consequences of these well-intended efforts remain to be seen. In Afghanistan, many frightened city dwellers are now using the Ehtesab crisis-mapping app to navigate the increasingly dangerous streets. The Afghanistan-based company is having to balance security risks to its own staff and concerns about protecting its users against the obvious usefulness of the platform.

The broad push from the tech and aid sectors to help more people get online also has dangerous location-centric consequences. While getting more people online inarguably has enormous economic and educational benefits, a person with a smartphone in an increasingly internet-centric society is a person who’s a lot easier to track down. Afghan government figures from 2019 found that 89% of Afghans could access telecommunications services, with 10 million internet users and 23 million cellphone users. As the new rulers of Afghanistan, the Taliban will have far greater access to telecom companies’ records on who people call and where they go. While the Taliban appears to still be mulling over how it will handle the internet, it’s unlikely it will take a liberal approach to personal privacy.

As a result of the takeover, this huge population of smartphone-using, at-risk Afghans (many of whom are younger people who have never known a pre-digital world) is now faced with the terrifying task of weighing their need to use the internet to keep themselves safe and connected with friends and allies against their knowledge that the Taliban will use their digital identities and footprints against them. While it remains unclear to what extent the Taliban can actually exploit their huge new supplies of data and surveillance tech, even the suggestion that they might be able to do so is enough to spread fear and to cow people into silence and submission.

Addressing the dangers of location data

There are a number of things that we—the well-intentioned aid, government, and private sector workers of the world—can do to help prevent future data disasters and make it less likely that the data (and spatial data) that we collect can be used to harm the people we want to help.

First and most importantly, it’s time to take privacy and data security seriously. Deadly seriously. The era of brushing off the concerns of privacy specialists as boring bureaucracy that makes it “harder to help people” should be officially dead following the Taliban’s takeover. As with most disaster scenarios, protecting data is an often unexciting and thankless task that becomes hugely important as soon as you actually need it. We need to move away from a culture of assuming that more data is always a good thing, and toward a culture of data minimization, in which we collect and store only enough data to accomplish our goals (and even then, with a critical eye toward the consequences).

More and more resources dedicated to doing data protection better in aid and development are popping up. The ICRC has published a second edition of its excellent data protection handbook, which includes specific guidance on securing spatial and remotely sensed data, and the United Nations Office for the Coordination of Humanitarian Affairs recently published its operational guidance on data responsibility in humanitarian action. Organizations like The Engine Room, Access Now, and Privacy International, among others, also publish valuable resources and information.

We also need to start thinking about privacy and data protection so seriously that we treat it as a right, not as a “nice to have” or a luxury that can be pushed aside during times of disaster and warfare. In 2017, I was a co-author of “The Signal Code: A Human Rights Approach to Information During Crisis,” which posits that people affected by disaster enjoy a right to privacy and security around the data we collect about them, among other rights. If we’re committed to “doing no harm,” then protecting the data we collect is essential.

We need to do a better job of teaching people about basic online security practices, and we should start doing it before disaster strikes. Fortunately, early efforts at this are underway. In August, Human Rights First published practical guides to digital security, including information on how to clean your digital history and how to evade facial recognition tech, in languages widely spoken in Afghanistan. Still, most of these resources were posted online only after the Taliban takeover in August. They might have been more helpful if they had existed, and if people had been made aware of them, before the security situation worsened. While data protection is always going to have an emergency element to it, we should change our culture to view teaching people about basic data protection as fundamental knowledge, akin to knowing what to do when there’s a fire or the basics of CPR.

Finally, we need to know way more about what harm from location data actually looks like and what forms it takes. While most of us in the data protection world have a visceral sense of how location and spatial data can be used to harm people, we still have relatively few good case studies, reports, or research papers to point to that dig into what these dangers look like in the real world. While high-resolution satellite and drone imagery can be used to identify and target people, there are relatively few resources to point to that explain how this works in practice. If we don’t understand these dangers, it’s a lot harder to prevent them. In the absence of this evidence, it’s also easier for others to blow off our concerns as paranoid and overblown, forcing us all to wait to make progress until a disaster so large and well-documented as to be unignorable takes place, as just happened in Afghanistan. We need more research funding and attention paid to better understanding spatial data and the risks it presents.

Tragically, the data-driven disaster in Afghanistan will not be the last of its kind. Data-driven approaches to intelligence and aid aren’t going anywhere, but neither are violent and unexpected takeovers of places we thought were secure. While we can’t predict the future, we can do more to prevent doomed last-minute scrambles to delete and secure sensitive information. In the 21st century, protecting people’s lives also means protecting people’s data.

Faine Greenwood is a consultant and writer on civilian drone technology.