Lessons from India’s attempt to marry biometric and voter ID databases

A woman goes through the process of eye scanning for the Unique Identification (UID) database system on the outskirts of the western Indian city of Ahmedabad, February 13, 2013. The UID database, known as Aadhaar, will help directly transfer budget-busting subsidies to the poor, plugging leakages in welfare spending. REUTERS/Amit Dave

Over the past decade, the Indian government has assembled a sprawling biometric database designed to improve the delivery of social services to the country’s more than 1 billion citizens. The Aadhaar database is one of the world’s largest biometric identity programs and has been credited with making it easier for Indians to access subsidies and pension payments. Using fingerprints and iris scans, Aadhaar has made it possible for the government to verify the identity of the country’s residents with relative ease. Now, the Election Commission of India wants to link its voter registration database with Aadhaar, a move that would have profound consequences not only for the privacy of Indian citizens but for the future of biometric databases worldwide.

As it stands, the Election Commission of India (EC) stores its voter registration information in its own database and has its own verification tools. However, the Election Commission of India believes Aadhaar can offer increased protections against fraud and registration errors. In August, the Government of India, on behalf of the EC, approached the Unique Identification Authority of India (UIDAI), the body that administers Aadhaar, with a proposal to integrate the two databases. In December 2021, the Lok Sabha passed the Election Laws Amendment Bill, which creates a legal framework for integrating the two systems. Opposition groups argue that the bill will face serious legal hurdles.

The Aadhaar-EPIC controversy (EPIC, the Electors Photo Identity Card, is India’s voter ID) illustrates the serious problems that can arise when large biometric identity databases are expanded beyond their remit. Far from making India’s elections more secure, the marriage of the two systems could lead to disenfranchisement and increased voter microtargeting. With countries around the world launching or already administering biometric databases, India’s efforts to marry its biometric identification system to its voter registration database will provide an important precedent for how governments deploy such systems. India’s experience with biometric identification systems should be a lesson for policymakers overseeing similar efforts about the importance of investing in the security of the information ecosystem in which biometric and voting data is housed, how access to this data is regulated and monitored, and how the technology is actually deployed in voter registration and identification.

A global democratic leader

Because India is the world’s largest democracy and an exporter of voting technology, its approach to electoral management is likely to influence how other countries run their elections. It was a mere decade ago that then-Secretary of State Hillary Clinton described India’s election commission as “the global gold standard for running elections” and observed that the commission “is already sharing best practices with counterparts in other countries, including Egypt and Iraq.” Today, India is an exporter of technology used to administer elections, and decisions made in India regarding the administration of domestic elections are likely to influence its peers.

The Aadhaar-EPIC controversy touches on two issues of concern to policymakers and scholars interested in the ways that biometric technologies are being integrated into electoral processes and democratic governance more broadly. The first is the global spread of biometric identity databases. According to the World Privacy Forum, 160 countries collect biometric data for national ID systems. Even when these systems work as intended, critics argue that they become tools of state surveillance creating “risks to privacy and anonymity” and conditioning “citizens into participating in their own surveillance and social control.” And these systems often do not work as intended. Their data ecosystems are often insecure and unregulated, consisting of multiple private and public actors, networks, and databases. This creates opportunities for private actors to access personal data and makes the systems an attractive target for malicious hackers. They can also be exclusionary, placing undue burdens on the rural and urban poor to take time off work, travel, and produce papers in order to get IDs. When biometric indicators change, a person’s identity could effectively be lost.

The second issue is the integration of biometrics into voter registration and identification systems. According to the International Institute for Democracy and Electoral Assistance, 50 of the 176 democracies in its database use some form of biometrics to verify the identity of voters. The reasons for using biometrics are straightforward. Ideally, they curb fraud, eliminate multiple registrations, and make elections more secure and efficient. In practice, however, the utility of these systems depends on the context in which they are deployed, such as the independence of a given country’s electoral management body, poll policy and training, civic education and voter confidence, and overall cost. Moreover, election management bodies often do not have the expertise or resources to design and implement their own biometric systems and are thus reliant on private actors to install and manage these systems. These private actors further complicate the information ecosystem in which personal data circulates.

The EC argues that integrating Aadhaar and EPIC is important for two main reasons. First, it will eliminate fraudulent and duplicate registrations, which can bog down electoral administration, slow voting times, and in some cases affect electoral outcomes. Second, it will make it easier for migrant laborers to vote as it would allow them to walk into any polling station, have their identity verified, and then cast a ballot in their home election.

Critics of the move contend that the union of the two systems poses major risks. Civil society actors and journalists have argued that not only is the move unnecessary, but that it could also threaten voter privacy and lead to mass disenfranchisement and fraud. Most recently, a group of 500 prominent citizens and 23 civil society organizations signed a statement decrying the EC’s proposal, calling it “a dangerous idea which can fundamentally alter the structure of our democracy.”

The Aadhaar ecosystem

Launched in 2010 by the Unique Identification Authority of India (UIDAI), Aadhaar was initially designed as a voluntary system for verifying the identities of Indian residents. Explicitly not a citizenship card, Aadhaar aimed to ease access to welfare services such as pension systems, cooking gas subsidies, and income taxes by providing people a simple mechanism for confirming their identity. Enrollees receive a unique 12-digit ID number in exchange for some simple demographic information and two forms of biometric identification: a fingerprint and an iris scan. 

But Aadhaar quickly expanded past its initial remit and became a quasi-mandatory form of identity for accessing social services. This rapid and largely unregulated expansion has resulted in data leaks, an inability to access government services, degraded biometrics, and loss of identity. It also spurred a series of legal challenges that were resolved in 2018, when India’s Supreme Court validated Aadhaar’s constitutionality but limited its use to certain kinds of welfare programs. Critics of Aadhaar continue to attack the system’s security and data gathering procedures and argue that Aadhaar is a tool of state surveillance that makes it harder for the poor to receive benefits.

Researchers have identified myriad issues with Aadhaar, but two are particularly relevant to its integration with EPIC. The first is the lack of data standards in Aadhaar’s enrollment process. This process usually involves three actors: the UIDAI, registrars, and enrollment agencies (EAs). First, the UIDAI signs a memorandum of understanding with a registrar, which may be a government agency, a public sector undertaking (a company whose ownership is split between the state and private entities), or another organization, granting it the authority to enroll people in Aadhaar. Registrars then farm out enrollment to an EA. These agencies use UIDAI-approved software and biometric devices to register people for Aadhaar. While EAs use certified equipment and software, there is no standardized approach to data collection. EAs have significant discretion as to what kinds of documents they can accept to identify someone for Aadhaar enrollment. This creates possibilities for data-entry errors and corruption in the Aadhaar registration system, causing a host of issues that range from failure to receive pension payments and other welfare benefits to identity theft and the public exposure of personal information.

The second relevant issue is a lack of transparency and accountability in how Aadhaar’s data is handled when it is seeded with other databases. In 2017, for example, in a report for the Centre for Internet and Society, Amber Sinha and Srinivas Kodali analyzed publicly available datasets from four schemes seeded with Aadhaar. They found 100-135 million Aadhaar numbers and 100 million bank account numbers disclosed across the four schemes. Incidents such as this show that Aadhaar’s insecure ecosystem creates real-world harms for Indian citizens. The original intent behind Aadhaar was to create a simple system for verifying the identity of Indian citizens trying to access government services. However, data leaks and inappropriately handled data have led to many accounts of personal information being exposed. A recent audit of the UIDAI by India’s Comptroller and Auditor General (CAG) found that the organization had failed to properly regulate its client vendors and ensure the security of their data vaults. The report also found many instances of duplicate and incomplete registrations, and it criticized the UIDAI for failing to ensure the quality of biometrics and for making cardholders responsible for fees associated with updating poorly taken biometrics. “The lack of accountability is an inherent feature of the Aadhaar system,” Apar Gupta of the Internet Freedom Foundation said in reacting to the audit. “The findings of the CAG audit confirm ground level studies of junk enrollments, faulty and low-quality demographic and biometric data.” The insecurity of the Aadhaar system, along with the UIDAI’s lack of accountability, should make the Election Commission wary of partnering with it.

Aadhaar, the EC, and mass disenfranchisement

Aadhaar’s insecure ecosystem, lack of data standards, and the UIDAI’s lack of transparency and accountability have led researchers like Vibhav Mariwala and Prakhar Misra to argue that the marriage of Aadhaar and EPIC will exacerbate the principal problem it is intended to solve: voter disenfranchisement and registration irregularities. Mariwala and Misra’s concerns stem not just from the extant issues with Aadhaar, but also from the Election Commission’s first attempt to combine the databases. In 2015, the Election Commission launched that attempt, an initiative known as the National Electoral Roll Purification and Authentication Programme (NERPAP). NERPAP only operated for a few months before being halted by a Supreme Court decision that limited the use of Aadhaar to four specific welfare schemes. In the brief period it was operational, NERPAP linked the registration information of 320 million voters to their Aadhaar numbers, but it also disenfranchised 3 million voters.

In the aftermath of NERPAP, controversy arose over whether or not the EC had actually received consent from voters to link their EPIC data with Aadhaar. At the time, the Election Commission claimed that the only mechanism for linking the two systems was the National Voters Service Portal. When voters logged in, they could voluntarily choose to link their accounts to Aadhaar. Four years later, these claims were challenged when over 3 million voters showed up to cast ballots in the state of Telangana only to find their names deleted from the rolls. In the ensuing scandal, multiple right-to-information requests filed after the election in four states revealed that the EC pursued several tactics to rapidly seed the EPIC database with Aadhaar. According to press reports, including in The Wire, these tactics included accessing other national databases, enlisting local election officials to gather Aadhaar numbers during registration processes, and using the UIDAI’s DBT Seeding Data Viewer (DSDV) tool, which allowed the EC to search Aadhaar records and view non-biometric Aadhaar data side by side with voter ID information. In each of these cases, consent was sketchy at best and was certainly not obtained in the straightforward way that the EC claimed.

The controversy also forced the EC to admit that voter names were deleted during NERPAP in 2015. In this case, the software that was used to link Aadhaar and EPIC deleted supposedly duplicate voters without verifying that they were in fact duplicate registrations. The EC did not reveal how its software identified duplicate registrations and insisted that, despite the mass deletion, this type of de-duplication process needed to be carried out across the country. Critics of the government’s effort to marry Aadhaar and EPIC fear that another such mass deletion could take place if the government is allowed to once more attempt to combine the two databases. While it is difficult to gauge the likelihood of another mass deletion, the Election Commission’s lack of transparency about the first linkage and its continued unwillingness to spell out how EPIC will be connected to Aadhaar certainly raise red flags. These concerns are heightened by the serious security concerns plaguing the Aadhaar ecosystem.
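
Because the EC has not disclosed how its software identified duplicates, the following is not a reconstruction of that software. It is a minimal Python sketch, under invented assumptions, of why deleting on a weak match key is risky: the records, the field names, and the choice of name plus birth year as the key are all hypothetical, and the `person` field stands in for ground truth that a real system would not have.

```python
from collections import defaultdict

# Invented records for illustration only; not a real electoral roll schema.
# "person" is hypothetical ground truth used to check the outcome.
roll = [
    {"id": 1, "name": "Lakshmi Reddy", "birth_year": 1980, "person": "p1"},
    {"id": 2, "name": "Lakshmi Reddy", "birth_year": 1980, "person": "p2"},  # different voter, same name and year
    {"id": 3, "name": "Lakshmi Reddy", "birth_year": 1980, "person": "p1"},  # genuine duplicate of id 1
    {"id": 4, "name": "Arjun Rao", "birth_year": 1975, "person": "p3"},
]

def group_by_weak_key(records):
    """Group records that share the (assumed) match key of name and birth year."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["name"], rec["birth_year"])].append(rec)
    return groups

def auto_delete(records):
    """Keep the first record per key and drop the rest without any verification."""
    return [recs[0] for recs in group_by_weak_key(records).values()]

def flag_for_review(records):
    """Keep every record; return the ids that share a key so a person can verify them."""
    flagged = [
        rec["id"]
        for recs in group_by_weak_key(records).values()
        if len(recs) > 1
        for rec in recs
    ]
    return records, flagged

kept = auto_delete(roll)
# Voters who had a registration before de-duplication but have none afterwards.
lost = {r["person"] for r in roll} - {r["person"] for r in kept}
_, flagged = flag_for_review(roll)

print("voters removed entirely:", lost)
print("ids flagged for manual verification:", flagged)
```

In this toy roll, automatic deletion removes the genuine duplicate but also strikes a distinct voter who happens to share a name and birth year, while the flag-and-verify variant deletes no one and leaves the decision to a human check.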

Aadhaar, EPIC, and microtargeting

As it stands, voter data in India is easily available online, but because it isn’t machine readable, it is difficult to microtarget voters. The lack of machine-readable data is exacerbated by the fact that many people have the same name, and sometimes those names are spelled differently across different databases. Marrying EPIC and Aadhaar would solve both these problems. As Anuj Srivas writes in The Wire, it would make matching one’s voter information with information in a wide array of other databases easy because Aadhaar allows identification across databases. The marriage of the two databases could thus lead to increased opportunities for microtargeting by the sitting government.
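
The linkage argument can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the records are invented, the field names are not the real EPIC or Aadhaar schemas, and the `uid` column simply stands in for any identifier shared across databases.

```python
# Invented records for illustration; "uid" is a hypothetical shared identifier.
voter_roll = [
    {"uid": "A1", "name": "Ramesh Kumar", "station": 12},
    {"uid": "A2", "name": "Ramesh Kumar", "station": 12},  # a second voter with the same name
    {"uid": "A3", "name": "Sunita Devi", "station": 12},
]
benefits = [
    {"uid": "A1", "name": "Ramesh Kumar", "scheme": "pension"},
    {"uid": "A2", "name": "Rameshkumar", "scheme": "cooking gas"},  # spelling differs
    {"uid": "A3", "name": "Suneeta Devi", "scheme": "pension"},     # spelling differs
]

def match_by_name(voters, others):
    """For each voter, list the records in the other database with the exact same name."""
    by_name = {}
    for rec in others:
        by_name.setdefault(rec["name"], []).append(rec)
    return [by_name.get(v["name"], []) for v in voters]

def match_by_uid(voters, others):
    """For each voter, look up the single record carrying the same identifier."""
    by_uid = {rec["uid"]: rec for rec in others}
    return [by_uid.get(v["uid"]) for v in voters]

name_links = match_by_name(voter_roll, benefits)
uid_links = match_by_uid(voter_roll, benefits)

# Count links that point to the right person, using uid as ground truth.
name_correct = sum(
    1
    for v, candidates in zip(voter_roll, name_links)
    if len(candidates) == 1 and candidates[0]["uid"] == v["uid"]
)
uid_correct = sum(
    1
    for v, rec in zip(voter_roll, uid_links)
    if rec is not None and rec["uid"] == v["uid"]
)

print("correct links by name:", name_correct, "of", len(voter_roll))
print("correct links by identifier:", uid_correct, "of", len(voter_roll))
```

In this toy data, exact name matching links one voter correctly, links a second voter to someone else's record, and misses the third because of a spelling variant. A shared identifier links all three, which is what makes cross-database profiling straightforward.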

The possibility of microtargeting in India would seriously threaten the secrecy of the ballot. Each polling station in India serves approximately 3000 voters, and polling stations have to be within two kilometers of a voter’s home—making it fairly easy to determine whom a given voter cast their ballot for if microtargeting data is available. Moreover, microtargeting at this level could allow government actors to direct specific policies toward groups of local beneficiaries, as Aadhaar is primarily used to deliver government services. For example, these techniques could potentially allow a ruling party to target welfare schemes and infrastructure projects to specific polling communities. This might sound far-fetched, but in the run-up to the 2019 general election, the ruling BJP openly engaged in poll-based, benefit-focused campaigning. If the BJP were able to further tailor this kind of campaigning to smaller communities, it would allow them to consolidate their base and gain new voters.

Researchers warn that political microtargeting could result in a loss of privacy and exposure to selective information, providing fertile ground for mis- and disinformation to spread and polarization to increase. The fact that voters are typically unaware that they are being targeted undermines their ability to determine which information is relevant to them. Private companies often carry out microtargeting on behalf of political parties, and as digital intermediaries, the companies are not subject to the transparency and accountability mechanisms assigned to traditional political actors. It is important to note, however, that the actual efficacy of political microtargeting is debatable. As Jessica Baldwin-Philippi, the political communications scholar, notes, it is important to separate theoretical concerns about microtargeting from analyses of its actual impact. With that distinction in mind, marrying EPIC with Aadhaar has the potential to facilitate problematic forms of microtargeting. As retired Supreme Court Justice B.N. Srikrishna provocatively summed up these concerns: “Instead of having a Cambridge Analytica you’ll have a Delhi Analytica, a Mumbai Analytica, a Calcutta Analytica. That is the danger.”


The EPIC-Aadhaar controversy has serious implications for global actors interested in identification technologies, e-governance, and electoral processes. First, it provides additional evidence that national biometric identity databases are at best problematic, especially when they expand far past their initial remit. Second, it adds a list of contextual factors that need to be considered when gauging the utility of using biometrics in electoral processes. These factors include the security of the information ecosystem in which biometric and voting data is housed, how access to this data is regulated and monitored, and how the technology is actually deployed in voter registration and identification. Finally, the introduction of biometrics into electoral processes could lead to their integration into the actual voting process. For example, the EC revealed that it was developing blockchain voting technology with the Indian Institute of Technology Madras, in Chennai, for use as early as the 2024 general election. Blockchain voting technology would use biometric data to confirm a voter’s identity, meaning that voters would need an Aadhaar number to cast their ballots. Using biometric data in this way would heighten concerns about mass disenfranchisement and electoral transparency.

Patrick Jones is a scholar of emerging media and digital technologies and received his PhD from the University of Oregon in 2020.