How China harnesses data fusion to make sense of surveillance data

Surveillance cameras are seen mounted in front of two Chinese flags.

Across the Chinese government’s surveillance apparatus, its many arms are busy collecting huge volumes of data. Video surveillance footage, WeChat accounts, e-commerce data, medical history, and hotel records: It’s all fair game for the government’s surveillance regime. Yet, taken individually, none of these data streams tells authorities very much. That’s why the Chinese government has embarked on a massive project of data fusion, which merges disparate datasets to produce data-driven analysis. This is how Chinese surveillance systems achieve what authorities call “visualization” (可视化) and “police informatization” (警务信息化).

While policymakers around the world have grown increasingly aware of China’s mass surveillance regime, from its most repressive practices in Xinjiang to its exports of surveillance platforms to more than 80 countries, relatively little attention has been paid to how Chinese authorities are making use of the data they collect. As countries and companies consider how to respond to China’s surveillance regime, policymakers need to understand data fusion’s crucial role in monitoring the country’s population in order to craft effective responses.

Data fusion in Chinese surveillance programs 

As China’s population has embraced online life, the Party-state’s mass surveillance practices have evolved from relying on more manual methods—such as dānwèi (单位) work units, the hùkǒu (户口) residency registration system, and dǎng’àn (档案) secret political files—to using technologies that range from the mundane to the cutting-edge. To achieve the goal of “stability maintenance” (维稳), China’s national surveillance programs utilize varying degrees of data fusion. Data fusion is present, for example, in national defense crisis response platforms (国防动员) developed in the mid-2010s that gather data from multiple “thematic clouds,” including e-commerce, tourism, industry, courts, and law enforcement. Other recent programs that rely on data fusion include Sharp Eyes (雪亮工程), the nation-wide Police Cloud (警务云), and Xinjiang’s Integrated Joint Operations Platform (IJOP, 一体化联合作战平台).  

One of the Chinese government’s most prominent data fusion programs is Sharp Eyes, which was launched in 2015 by nine government bodies. The program builds on the infrastructure used by Skynet—a 2005 initiative that focused on surveillance in urban public areas—and extends it into rural areas. Sharp Eyes pulls from a wide variety of data sources. These include surveillance cameras—both privately and government-owned and with and without facial-recognition capabilities—and vehicle and license plate recognition cameras. Public and private video surveillance systems collect facial and other attributes from key locations such as hospitals, schools, entertainment venues, hotels, internet cafes, major road intersections, and storefronts. Sharp Eyes also aims to collect “virtual identities,” such as MAC addresses, phone numbers, and WeChat accounts. 

Authorities ascertain individuals’ identities by first combining the above information with geographic information system (GIS) data and then sending this data to “societal resource integration platforms,” which exist in Xinjiang and at least four other provinces. According to analysis originally published in the journal China Digital Cable TV, a publication supervised by the Ministry of Education and the Ministry of Science and Technology, these platforms combine facial and vehicle recognition data and match it against private and public video sources. GIS data is superimposed on live video feeds to provide granular location data. Multiple companies can be involved in one platform project. For example, one local Sharp Eyes project in Fujian Province uses products from prominent (and blacklisted) AI companies such as Yitu, Huawei, and Hikvision.

However, Sharp Eyes is not entirely powered by data fusion. In fact, human-centric surveillance is a key design element. In the city of Linyi, where Sharp Eyes was piloted, the local government upgraded citizens’ television cable boxes so they could view surveillance feeds and report crimes by pushing a button on their TV remotes. The Ministry of Justice even provided a patriotic slogan for the effort: “remote control in hand, safety in heart.” (This citizen-centric surveillance strategy originates from the Cultural Revolution, which inspired Sharp Eyes’ name.) As part of Sharp Eyes, mobile apps push video surveillance and public security information to citizens and allow assigned groups of households to report crimes. Command and control centers are staffed by personnel who review footage, take citizens’ reports, and dispatch police accordingly.

Another national program that uses data fusion is the Ministry of Public Security’s (MPS) Police Cloud, which has been active since 2015. Provincial police cloud-computing centers fuse data from public and private sources, including ID cards, CCTV footage, medical history, supermarket memberships, IP addresses, social media usernames, delivery records, residential addresses, hotel stays, petition records, and biometrics, according to a 2017 report from Human Rights Watch. In a nod to “visualizing” data, the system aims to uncover connections between individuals that would otherwise be difficult for police officers to detect on their own. It supposedly predicts future actions or threats that might cause social instability, such as protests and acts of terror.  

Targeting ‘focus personnel’ 

China’s data-fusion programs allow its surveillance systems to assemble highly detailed portraits of the country’s citizens, but these systems apply particularly severe scrutiny to “focus personnel,” which includes individuals petitioning the government, those purportedly involved in terrorism, and those “undermining social stability.” China’s Uyghur ethnic minority is among those that fall under the “focus personnel” category, and in the Xinjiang region, the center of Uyghur life, this persecuted minority is subjected to intense surveillance. 

One key tool in this surveillance regime is the Integrated Joint Operations Platform (IJOP), which monitors Xinjiang residents with unprecedented intrusiveness. The system flags mundane and otherwise legal behavior as warranting further surveillance, imprisonment, or even extralegal internment in Xinjiang’s vast network of concentration camps. The IJOP functions as a data fusion tool by tying an individual’s government-issued ID card to that person’s physical attributes (such as facial features, blood type, and height), as well as tracking where individuals’ phones, ID cards, and vehicles go. The system collects a variety of data from afar, flagging behavior such as excessive electricity use, the use of WhatsApp and VPNs, and driving someone else’s car. The system also relies on highly intrusive methods. Scattered at strategic locations such as malls and mosques around Xinjiang are what are known as “three-dimensional portrait and integrated data doors” (三维人像综合数据门). These “doors” resemble airport metal detectors and possess facial recognition capabilities, ID card verification, and tools to lift a variety of data from mobile phones, such as MAC addresses and IMEI, IMSI, and ESN numbers. The IJOP also ingests data collected by what have been dubbed “anti-terrorism swords,” which are used at police checkpoints to plug into phones and download their contents, according to an Intercept investigation. The IJOP sends push notifications to officers, who can pull aside someone walking through a data door. They can then interrogate individuals and detain or imprison them. In this way, the IJOP can restrict individuals’ movements, which are limited based on the threat level the system assigns them.

China’s military is another player using data-fusion tools in Xinjiang to build predictive policing systems. In 2019, a professor named Cui Yinglong at the People’s Armed Police Engineering University in the regional capital of Ürümqi developed a “dynamic early warning” system, known as the Tianshan Anti-terrorism Cloud (天山反恐云). It is trained on a database he helped build called the “East Turkistan terrorist activity database,” which collects and fuses data based on terrorist methods, objectives, attack dates, and organization from 1990 to the present. Although little information exists on the inner workings of the app, it purports to “accurately depict” terrorists’ “religious, organizational, and behavioral characteristics.” The cloud is designed to provide soldiers with early warning of terrorist activity and combat decision support, and is apparently in active operational use.  

Issues with ‘information islands’  

While it is tempting to conclude that China’s surveillance state is effortlessly and automatically tracking and surveilling every person in the country, its monitoring systems are plagued by human inefficiency, unreliable and incomplete basic data, and incompatible datasets and systems. These inefficiencies have resulted in data silos, also known as “government information islands” (政府信息孤岛), a phrase that refers to isolated data pools that are not adequately shared within government bureaucracies. Currently, data is shared horizontally (across departments and regions) and vertically (within the same organizational entity from the local level up). But scholars have found that horizontal integration bodies suffer from a lack of information that is stove-piped in vertical bodies. As a result, authorities are hindered by bureaucratic systems that prevent more comprehensive data access and greater insight.  

To better execute data fusion, Chinese authorities are now attempting to tackle longstanding issues with data silos, according to a recent ChinaFile analysis of government procurement notices. In 2019, for example, authorities in Beijing looked to build a “Sharp Eyes Video Sharing and Exchange Platform” that would make available video footage from cameras belonging to a range of different departments on a single platform. By integrating footage from approximately 200,000 cameras—and up to 1 million—the platform would improve access to footage and data re-use. To further improve data fusion, future improvements to Sharp Eyes include implementing a standard data mining approach and overcoming technical difficulties and inconsistent standards.  

On the legal front, authorities have also taken several steps to address data silos. While some of this reform preceded the advent of surveillance programs such as Sharp Eyes and the IJOP, scholars Huirong Chen and Sheena Chestnut Greitens note that newer reforms under Xi Jinping coincided with these programs. They include the 2017–18 reorganization of the People’s Armed Police and the passage of the 2017 National Intelligence Law, which aimed to integrate disparate intelligence and national security authorities. Furthermore, the 13th Five-Year Plan for National Informatization (2016–2020) detailed issues with “information islands” and called for integrating systems across ministries and departments. In 2017, the Party asked local governments to establish comprehensive information platforms, likely similar to the ones established under Sharp Eyes.

Policy implications 

Responding to the Chinese government’s use of data-fusion systems to power its surveillance systems represents a difficult challenge. Under the Trump Administration, the U.S. Department of Commerce’s Bureau of Industry and Security (BIS) blacklisted multiple Chinese companies for their human rights abuses in Xinjiang, adding them to the so-called “Entity List.” By being added to the Entity List, these firms should have been cut off from U.S. suppliers, but given the opaque nature of supply chains, it is difficult to assess whether sanctioned firms have been able to access U.S.-originating technologies via workarounds. It is also difficult to assess whether homegrown R&D initiatives to develop domestic alternatives to U.S. goods have borne fruit. Furthermore, the vast majority of the companies on the Entity List do not have data fusion as their main line of business; rather, they often provide services that feed into fusion architectures. This means U.S. policy is potentially overlooking a key area of Chinese surveillance systems. (Media and nongovernmental organizations have created lists of Chinese and non-Chinese companies that enable surveillance data fusion practices. These lists can aid the BIS in making determinations for its Entity List.)

On the technical side, governmental organizations, such as the National Science Foundation and the Defense Advanced Research Projects Agency, can fund research to thwart data fusion processes. One emerging line of research seeks to develop privacy-preserving computer vision systems that obscure individuals’ faces. Other lines of research seek to build counter-surveillance technologies, such as specialized clothing, adversarial examples (inputs intentionally crafted to make a model malfunction), and data poisoning (modifying training images). Interfering at the stage prior to fusion would not only protect the privacy of those targeted by surveillance systems, but also prevent data fusion processes from functioning accurately.

It is also important to look at the role of U.S. companies. Chinese surveillance systems are heavily reliant on U.S. firms to provide the gear that powers these digital operations. U.S. suppliers such as Intel, NVIDIA, Cisco, Seagate, and Western Digital have all been linked to various aspects of Chinese surveillance systems, but the U.S. government has so far been unable to write rules effectively prohibiting the sales of such equipment. In 2020, the State Department released exhaustive guidance for companies’ export considerations, but the document is nonbinding. The challenge of imposing binding rules on U.S. companies, along with the continued synergy between the commercial sphere and surveillance states’ technical needs, makes it difficult to prevent the export of such technology. As the United States has staked out a clear policy against surveillance-enabled repression, especially in Xinjiang, it is no longer acceptable for companies to deny knowledge of involvement in supporting China’s surveillance state. One of the most viable options to decrease companies’ more problematic exports remains increased public pressure.

Increased attention to Chinese data fusion practices, and to the companies that support them, would allow U.S. policy to target China’s surveillance state at a core level, rather than only the facial and voice recognition elements that feed into fusion architectures. Taking such steps would better protect Uyghurs, “focus personnel,” and the Chinese people writ large from falling under increasingly abusive, unchecked surveillance.

Dahlia Peterson is a research analyst at Georgetown’s Center for Security and Emerging Technology.  

Intel provides financial support to the Brookings Institution, a nonprofit organization devoted to rigorous, independent, in-depth public policy research.