Privacy under siege: DOGE’s one big, beautiful database

The Department of Government Efficiency (DOGE), according to reporting in the Washington Post, recently set its sights on creating “a single centralized [government] database” that would enable broad access across government agencies to vast amounts of information currently collected and held by individual federal agencies.

Government data aggregation and unification on this scale is antithetical to the purpose-driven requirements for data sharing among government agencies that lie at the heart of the Privacy Act, a 1974 law passed in the aftermath of Watergate and the FBI’s Counterintelligence Program (COINTELPRO) scandals. Then-Senate Judiciary Chairman Sam Ervin, the Privacy Act’s primary sponsor, recognized that one of Watergate’s primary lessons was the need to place “limits upon what the Government can know about each of its citizens.”

The Privacy Act established “a code of fair information practices that governs the collection, maintenance, use, and dissemination of information about individuals that is maintained in systems of records by federal agencies.” A “system of records” is defined as “a group of any records under the control of any agency from which information is retrieved by the name of the individual or by some identifying number, symbol, or other identifying particular assigned to the individual.” Federal agencies are prohibited from disclosing records covered by the Privacy Act absent written consent of the individual to whom the records pertain, unless the disclosure is pursuant to one of the act’s 12 statutory exceptions. Those exceptions include disclosures to other officials within the agency “who have a need for the record in the performance of their duties” or disclosures that constitute a “routine use.” Significantly, the act allows agencies to define “routine uses” for each system of records that allow for disclosures without individual consent. The statutory standard, compatibility with the purpose for which the record was collected, is vague and thus supports broadly defined disclosures established by agencies.

The current DOGE data merger enterprise tracks efforts by DOGE, some successful, to gain access to and integrate information from multiple government agencies. According to a letter from recently deceased Rep. Gerald Connolly (D-Va.), the former ranking member of the House Committee on Oversight and Government Reform, a whistleblower reported that “DOGE engineers have tried to create specialized computers for themselves that simultaneously give full access to networks and databases across different agencies” and “have assembled backpacks full of laptops, each with access to different agency systems, that DOGE staff is using to combine databases that are currently maintained separately by multiple federal agencies.”

DOGE’s efforts to build a centralized government database comport with President Trump’s executive order requiring government agencies to break down “information silos,” to the “maximum extent consistent with law,” for the purpose of “eliminating bureaucratic duplication and inefficiency while enhancing the Government’s ability to detect overpayments and fraud.” These efforts, it would appear, are part of the current administration’s stated goals of improving government efficiency and eliminating waste, fraud, and abuse. When framed broadly and abstractly, these are worthy goals—executing changes to government operations, consistent with the law, that increase efficiency and cut waste can improve the federal government’s service to the public.

In this instance, however, there is a danger in accepting the rhetoric of efficiency and elimination of waste, fraud, and abuse at face value. This rhetoric obfuscates the substantial growth in government surveillance powers this project incidentally or intentionally promotes. The creation of a single, centralized government database will allow for aggregation of and broad government access to extremely sensitive data including Social Security numbers, federal tax returns, health records, birth dates, and income, asset, and demographic information.

Should this database be built, it would constitute an enormous expansion of government surveillance powers that can be directed toward more efficient targeting of individual Americans.

Total information awareness

This is not the first time the federal government has pursued a system capable of integrating and searching vast and multiple types of transactions, records, and data across different sources and databases. In the wake of the Sept. 11, 2001, attacks, the Defense Advanced Research Projects Agency (DARPA) proposed a program called “Total Information Awareness” (TIA), later renamed the Terrorism Information Awareness program.

TIA was “a research and development program intended to counter terrorism through prevention.” As one of us described in a book chapter about U.S. collection of data,¹ back in 2002, John Poindexter, retired admiral and director of DARPA’s Information Awareness Office, identified the “transaction space” as one “significant new data sourc[e] that need[ed] to be mined to discover and track terrorists.” This space included data encompassing communications, financial, education, travel, medical, veterinary, country entry, place/event entry, transportation, housing, critical resources, and government records. Part of the plan was for “Red Teams” to develop model attack scenarios, then determine the types of transactions necessary to carry out such attacks. These transactions would form patterns discernable in databases to which the government would have lawful access. Having developed targetable patterns of attack precursor behavior, the government could then search across databases, some of which would involve access to data held by the private sector, to detect the presence of those patterns.

At the time, the American Civil Liberties Union called TIA “the closest thing to a true ‘Big Brother’ program that has ever been seriously contemplated” by the U.S. government. The government, for its part, appeared to believe new legislation amending the Privacy Act would have been needed to allow for at least some of the data sharing and integration envisioned by the program. In response to “intense public controversy,” Congress terminated funding for the program in 2003, although some research aspects were transferred to another group working with the National Security Agency.

Regardless of one’s position on the TIA program or Congress’ decision to terminate its funding, it is notable that the government, following the 9/11 attacks, articulated a specific purpose and justification for the program—analyzing data to detect patterns that it believed would help it prevent terrorist attacks. Poindexter argued for new and enhanced government surveillance powers and emphasized the need to “‘break down stove pipes’ that separate commercial and government databases” to allow “teams of intelligence agency analysts to hunt for hidden patterns of activity.” This focus and articulation of purpose stands in sharp contrast to the generalities offered in support of DOGE’s efforts to create one centralized government database. Elon Musk, the former leader of DOGE, has, for example, stated that the “biggest vulnerability for fraud” comes from the fact that government “databases don’t talk to each other.”

Big potential Privacy Act infidelities

DOGE’s efforts to access data at various government agencies include, but are not limited to: the Treasury Department’s payment system that contains Social Security numbers, federal tax returns, home addresses, and birth dates; the Office of Personnel Management’s (OPM) system, which stores background checks, medical and bank account information, and biometric data of federal employees; along with the systems of the Social Security Administration (SSA), the Education Department, the Labor Department, and the Department of Health and Human Services, all of which store equally sensitive personal data. As of this writing, there are at least 12 lawsuits alleging Privacy Act violations that involve the disclosures of agency records to DOGE, among other claims.

At the heart of these cases are questions about whether access by or disclosures to DOGE violate the Privacy Act in three significant ways. First, if the relevant individuals from DOGE are employees of the agency maintaining the records, did they “have a need for the record[s] in the performance of their duties?” Although courts have historically found a range of intra-agency disclosures satisfy this exception, the Privacy Act itself does not define the term “need.” To determine if a “need to know” exists, courts have considered “whether the official examined the record in connection with the performance of duties assigned to him and whether he had to do so in order to perform those duties properly.” Another potentially relevant question related to the “need to know” inquiry is whether an agency official really needs identifiable records to meet the purpose. As Justice Ketanji Brown Jackson notes in her dissent in the ongoing Privacy Act litigation over DOGE access to SSA information, “[r]ecord evidence reflects that DOGE received far broader data access than the SSA customarily affords for fraud, waste, and abuse reviews,” whereas “similar investigations typically ‘start with access to high-level, anonymized data based on the least amount of data the analyst or auditor would need to know.’”

Second, if the relevant individuals from DOGE are not employees of the agency maintaining the records, then was the disclosure properly established as a “routine use,” which the statute defines as a purpose “compatible with the purpose for which [the record] was collected”? The term “compatible” is not further defined and is assessed on a case-by-case basis. Office of Management and Budget (OMB) guidance characterizes “compatibility” as “(1) functionally equivalent uses and (2) other uses that are necessary and proper.” Courts have held that routine use disclosures enabling an agency to “fulfill its mission are ‘compatible’ disclosures.” To establish a routine use, the Privacy Act requires agencies to publish notice in the Federal Register in order to provide constructive notice to the public and to accept comments. For many new activities, the act also requires publication of updated notice of agency systems as well as notice to Congress.

Third, have these disclosures occurred without appropriate safeguards to ensure the security and confidentiality of the relevant records? The statute requires agencies to “establish appropriate administrative, technical, and physical safeguards to insure the security and confidentiality of records and to protect against any anticipated threats or hazards to their security or integrity which could result in substantial harm, embarrassment, inconvenience, or unfairness to any individual.” In one current lawsuit involving OPM records, the American Federation of Government Employees alleges that “OPM Defendants did not establish security vetting and security training for the DOGE Defendants before they were given . . . access [to the records].” Of particular concern to these plaintiffs is the fact that an “OPM data breach disclosed in 2015 affected over 22 million people and led to identity theft and fraud.”

The litigation in these matters is ongoing. Whether courts will find that Privacy Act violations have occurred remains unknown.

On an even broader scale, integrating and storing all the aforementioned types of data (and more) in one centralized government database raises core Privacy Act concerns. As Congress recognized when it passed the Privacy Act in 1974, “the privacy of an individual is directly affected by the collection, maintenance, use, and dissemination of personal information by Federal agencies” and “the increasing use of computers and sophisticated information technology, while essential to the efficient operations of the Government, has greatly magnified the harm to individual privacy that can occur from any collection, maintenance, use, or dissemination of personal information.” And as should be evident from the 2015 OPM cyber intrusion, the collection and storage of massive amounts of sensitive personal information translates into one “big, beautiful” target for hackers and foreign adversaries.

It is not clear that the Privacy Act, as currently written, is up to the task of preventing the creation of a single centralized government database integrating agency records if the government chooses to follow the act’s established procedures. That is, there are ways for the government to follow the letter, if not the spirit, of the Privacy Act in service of creating a centralized database.

Arguably, a federal agency could establish a new system of records with the purpose, for example, of preventing waste, fraud, and abuse and then publish in the Federal Register “a notice of the existence and character of the system of records,” commonly known as a system of records notice (SORN). Other agencies would also need to publish a notice “of any new use or intended use of the information” from their systems and provide an opportunity for the public to submit written views. To enable the lawful disclosure of the records kept by other agencies to the agency maintaining the centralized new system of records, the other individual agencies would each need to publish “routine use” notices in the Federal Register prior to making the disclosures to that centralized system. Again, it is arguable that these routine use notices could meet the statute’s compatibility standard insofar as they would show that the intended disclosures allow each agency to fulfill the broadly stated and not entirely unreasonable goal of stopping fraud, abuse, and waste. But one could easily argue otherwise, especially given the history behind the Privacy Act.

There are other laws—the Paperwork Reduction Act, E-Government Act, and the Federal Information Security Modernization Act—that place additional procedural and substantive requirements on agencies that collect, use, and disclose personally identifiable information. And still there may be other ways to challenge the creation of a single centralized database through, for example, substantive laws that govern specific federal government programs and that cabin the use of personal information. But so long as the government complies with all the procedural requirements of the Privacy Act and other laws, the procedural challenges that many lawsuits raise today may not prevent the creation of a single centralized database of government records.

Scrutinizing claims of efficiency that cloak the expansion of government surveillance powers

In their article “American Panopticon,” the Atlantic’s Ian Bogost and Charlie Warzel consider what the Trump administration might do with a central, integrated database of government records. They begin by highlighting some categories, certainly not all, of the sensitive, revelatory data stored in government databases, along with some ways the government uses the data. The FBI, for example, has a “facial-recognition apparatus” containing more than 640 million photos, including driver’s license and passport photos, along with mug shots. The SSA maintains “individual earnings histories for each of the 350+ million Social Security numbers that have been assigned to workers.” The Department of Veterans Affairs keeps “mental-health information on former service members, including notes from therapy sessions, details about medication, and accounts of substance abuse.” The National Institute of Standards and Technology created a “limited tattoo database,” which was provided to multiple agencies to train software systems to recognize tattoos commonly associated with gangs. Whistleblower information is also contained in government databases. Moreover, multiple government agencies including the IRS, FBI, Department of Homeland Security (DHS), and Department of Defense (DoD) purchase location data from data brokers, enabling the government to discern and map past movements of American citizens.

What is the administration’s stated purpose for merging all of these (and other) separate agency databases? Beyond the Trump executive order’s very general edict to eliminate, consistent with the law, government “information silos” and other assertions about alleged waste, fraud, and abuse, the administration has not specifically articulated the problems or inefficiencies that one big government database is meant to address, nor has it explained how DOGE will create this database while complying with the Privacy Act. Bogost and Warzel report that Harrison Fields, a spokesperson for the White House, “confirmed that DOGE is combining data that it has collected across agencies, but he did not respond to individual questions about which data it has or how it plans to safeguard citizens’ private information.”

Interviews conducted by Bogost and Warzel with former federal government employees reveal a number of ways the government could weaponize a centralized database, particularly when aided by advancements in artificial intelligence (AI) technology. The government in its role as debt collector could punish those who struggle to repay federal loans “beyond what’s possible now, by having professional licenses revoked or having their wages or bank accounts frozen.” The government or its private sector partners, should the data be shared, could sharpen capabilities for targeting groups of the population “based on a supposed attribute or trait.” Information from “background checks or health studies” could be used to identify and “punish people who have seen a therapist for mental illness.” Public benefits to those who have ever reached a certain income threshold could be terminated under the theory that those who once made a high salary don’t need public benefits. Combining government data with purchased location data would allow for “inferences about actions, activities, or associates of almost anybody perceived as a government critic or dissident.”

Simply put, the government could target individual Americans “at scale.” While Bogost and Warzel stress that the scenarios discussed are hypothetical, the administration’s “current use of combined data in service of deportations” along with its failure to offer credible evidence of alleged gang association or evidence of criminal convictions for some of those deported suggests a willingness “to use these data for its political aims.”

To the extent that this merger project is seen or represented as part of DOGE’s pursuit of government efficiency and the elimination of waste, fraud, and abuse, such speculative rationalizations do little more than mask the growth in government surveillance powers that a single centralized database portends. To date, the administration seems unconcerned about the Privacy Act’s requirements or commitments, forcing various sets of plaintiffs to file lawsuits. And DOGE’s actual contributions to improving government efficiency or curbing waste, fraud, and abuse appear questionable. Case in point, when DOGE set up new anti-fraud checks at the SSA relating to certain kinds of claims made over the telephone, only two out of 110,000 benefit claims were flagged as having “a high probability of being fraudulent” and “[l]ess than 1% of claims were flagged as even potentially fraudulent at all.” What the anti-fraud tool did accomplish, however, was to slow “retirement claim processing by 25%” following “weeks of changes to the agency’s telephone policies,” ultimately leading to a “degradation of public service,” as reflected in an internal document obtained by Nextgov/FCW “that examined potentially cutting the anti-fraud tool for phone claims.”

Once created, a single centralized government database will be difficult to dismantle. As the saying goes, you can’t unring a bell. But the most significant and ominous government efficiencies to be gained could appear not only in the form of enhanced efficacy in the targeting of immigrant populations, who are already under siege, but also in equally sharpened government scrutiny of every individual American citizen for any reason the government desires.