This article is part of a three-part series on how to address federal privacy legislation in the United States. The recommendations in this article, along with the two previously published sections on preemption and a private right of action, are adapted from a June 2020 Brookings Institution report, “Bridging the Gaps: A Path Forward to Federal Privacy Legislation.” Cameron F. Kerry, along with John B. Morris Jr., leads The Privacy Debate initiative at the Brookings Institution, which brings together stakeholders in civil society, industry, government and academia to discuss federal privacy legislation. A version of this article originally appeared on the Lawfare blog.
The intersection of privacy and civil rights is a relatively recent development in the privacy debate. But it takes on new saliency amid a national outcry about the heavy toll of racism on the lives of Black people and other people of color.
In the Brookings Institution report on which this series of posts is based, we recommend that privacy legislation address algorithmic discrimination because the use of personal information in ways that disadvantage individuals implicates an essential element of privacy. This post explains why and how.
How Civil Rights Fit Into Privacy
Concerns that personal information might be used in ways that perpetuate or exacerbate bias have become more prominent as predictive analytics, machine learning and artificial intelligence magnify the power to use personal information in granular ways. As a White House task force concluded in 2014, while data can be used for good, it “can also be used in ways that perpetuate social harms or render outcomes that have inequitable impacts, even when discrimination is not intended.”
In the contexts of policing and criminal justice, this concern intensifies. The death of George Floyd is only one of the most recent examples of the disproportionate impact of systemic discrimination on the lives of disadvantaged people, especially Black Americans.
This concern also applies to more subtle impacts in the commercial arena where hidden proxies can operate to reduce opportunities for Black Americans and other disadvantaged populations. In a stark example, Latanya Sweeney, a Harvard professor and former Federal Trade Commission (FTC) chief technology officer, demonstrated that online searches using names associated with African Americans were more likely to generate advertisements relating to arrest records and less favorable credit cards.
Civil rights present three threshold questions for privacy legislation. The first is how the issue fits into privacy legislation. There is a body of existing anti-discrimination law and jurisprudence that specialized enforcement agencies like the Equal Employment Opportunity Commission (EEOC) have decades of experience applying and adjudicating. Statutes such as Title VII of the Civil Rights Act or the Americans with Disabilities Act are largely outside the experience and mandate of the FTC. Meanwhile, discrimination presents novel issues in the information privacy context. These circumstances raise questions as to what a nondiscrimination provision in a privacy statute can add to existing law, and to what extent it should.
The second question addresses the charged politics that surround civil rights. Any enlargement or contraction of existing rights and remedies cuts across polarized social issues that are subjects of electoral trench warfare and beyond the effective reach of privacy legislation. This is especially the case for anything that enlarges categories of individuals protected under federal law.
The weight of this cargo is magnified by the number of congressional committees with oversight of civil rights laws. In the Senate, these consist primarily of the Judiciary Committee; Banking, Housing, and Urban Affairs Committee; and Health, Education, Labor, and Pensions Committee, but not the Commerce, Science, and Transportation Committee, which oversees the FTC and is the main committee on privacy legislation. Congressional committees guard their jurisdictions forcefully, and a dynamic akin to Metcalfe’s law applies here: The complexity of the path forward grows roughly with the square of the number of nodes it touches.
The third question is about the algorithms themselves. They are complex and opaque, and machine learning development outpaces human understanding. Numerous studies, reports, ethical frameworks and other analyses have identified issues, risks and benefits of algorithmic predictions and propounded various practices to avoid erroneous, discriminatory or otherwise undesirable outcomes. Even so, a generally applicable prescription for preventing and identifying algorithmic discrimination is a work in progress.
These issues counsel against overreach, but not sidestepping discrimination issues altogether. As I have written previously about artificial intelligence, “Use of personal information about [attributes such as skin color, sexual identity, and national origin], either explicitly or—more likely and less obviously—via proxies, for automated decision-making that is against the interests of the individual involved thus implicates privacy interests in controlling how information is used.” This makes discriminatory use of personal information an appropriate subject for federal privacy legislation. Seen in relation to the use of personal information, the pertinent injury is not the discrimination as such, but the use of such information in ways that are against the interests or contextual expectations of an individual linked to that information.
Moreover, the discrimination covered by the Civil Rights Act of 1964 and its progeny envisioned human agency—decisions by restaurant owners, landlords and other people. But in the 21st century, decisions can be made by machines or software—without a human in the loop. Deconstructing the basis for these decisions is a difficult undertaking of a different order from traditional employment or housing discrimination cases. This task requires new tools, and privacy legislation can help supply them.
Civil Rights in Federal Privacy Legislation
Going into 2020, the two leading privacy bills were the draft United States Consumer Data Privacy Act (USCDPA) from Republican Sen. Roger Wicker and the Consumer Online Privacy Rights Act (COPRA) from Democratic Sen. Maria Cantwell, respectively the chair and the ranking minority member of the Committee on Commerce, Science, and Transportation.
These bills show some degree of bipartisanship on the issue of algorithmic discrimination: They agree it has a place in privacy legislation and that the FTC should conduct a study of the discriminatory use of algorithms. They differ, though, on the role of both legislation and the FTC in addressing such discrimination as well as how closely to monitor algorithmic decision-making.
In our report, we propose an approach that combines provisions from both bills in ways that accomplish apparent objectives of both. But privacy legislation could do more than either bill to emphasize anti-discrimination obligations and distinguish algorithmic accountability from nondiscrimination. Hence, we suggest additional provisions to address discriminatory data use.
Duties of Loyalty and Care
In our prior post, we mentioned the COPRA provision entitled “duty of loyalty” that prohibits “harmful data practices” (Section 101(b)(2)), or practices that cause concrete and recognized harms—financial injury, offensive intrusions on individual privacy and “other substantial harms.” To broaden this duty of loyalty, we recommend combining subsequent COPRA and USCDPA limitations on data collection with a broad duty to process personal data “in a manner that respects the privacy of individuals” and “in accordance with law.” The resulting obligations would scale appropriately to the size and complexity of the entity involved and the nature and extent of the data use.
In our report, we also propose a distinct “duty of care” that would prohibit entities from causing specified, reasonably foreseeable harms, like those enumerated in COPRA, and that would link to standards of liability in our recommended limited and tiered private right of action. Among the harms subject to this duty, we would expressly prohibit entities from processing or transferring data in a manner that could cause “discrimination in violation of the Federal antidiscrimination laws or the antidiscrimination laws of any State or political subdivision thereof applicable to the covered entity.” Like the other harms addressed in this proposed duty, discrimination is well established in existing law.
These measures would incorporate nondiscrimination and compliance with anti-discrimination laws into a set of baseline duties to take into account the interests of individuals.
FTC Role in Federal Nondiscrimination Laws
The USCDPA provides for an indirect anti-discrimination role for the FTC, with the agency to “endeavor” to refer “information that any covered entity may have processed or transferred covered data in violation of Federal anti-discrimination laws” to relevant federal or state agencies authorized to enforce these laws, and to cooperate with these agencies.
In turn, COPRA makes discrimination a violation of the FTC Act by declaring:
A covered entity shall not process or transfer data on the basis of an individual’s or class of individuals’ actual or perceived race, color, ethnicity, religion, national origin, sex, gender, gender identity, sexual orientation, familial status, biometric information, lawful source of income, or disability
in any way that “unlawfully discriminates against or otherwise makes the opportunity unavailable” in housing, employment, credit or education opportunities, or that “unlawfully segregates, discriminates against, or otherwise makes unavailable” any public accommodation.
This language, incorporating a legislative proposal by Free Press and the Lawyers’ Committee for Civil Rights Under Law, substantially tracks Title VII of the Civil Rights Act of 1964. Most of the protected categories in COPRA are the subject of existing federal anti-discrimination laws, although COPRA notably omits age. Some categories in COPRA, though, are not covered by existing laws. “Biometric information,” as defined in COPRA, includes genetic data covered by the Genetic Information Nondiscrimination Act, but it also encompasses other characteristics not covered by existing laws. Familial status is currently protected only in the context of housing under the Fair Housing Act. As a result of the Supreme Court’s latest Title VII decisions, sexual orientation and gender identity are now covered as discrimination “on the basis of sex.”
We see the two proposals as consistent with emerging concerns about algorithmic discrimination and the substantive, jurisdictional and political challenges of fitting civil rights provisions into an information privacy law. The USCDPA reflects that the primary authority and expertise for enforcement of federal anti-discrimination laws rests with the agencies designated by those laws. It is possible both to maintain the role of the EEOC and other federal agencies and to make discriminatory uses of personal data a violation of privacy law and the FTC Act—as COPRA proposes. With this authority, the FTC can play an adjunct role in nondiscrimination enforcement.
Thus, we recommend combining a version of COPRA’s anti-discrimination provision with the USCDPA’s provision on FTC referrals to other agencies. This would maintain the primary role of existing enforcement agencies, while giving the FTC authority to apply its expertise in technology, data and algorithm use developed over the past decade as a force multiplier as well as to inform understanding of algorithms and their effects. Such anti-discrimination provisions would flesh out the provision in our proposed duty of care that prohibits discrimination in ways that violate federal anti-discrimination laws. Under a privacy statute, the gravamen of a violation would not be the discrimination as such, but the use of covered data in ways that are harmful to an individual.
Shifting the Burden of Explaining the Algorithm
We also suggest several changes to COPRA’s anti-discrimination language both to address how algorithmic discrimination differs from that prohibited by current federal anti-discrimination statutes and to hew more closely to current and future federal laws. First, instead of using “on the basis of” protected classifications, as COPRA and the Free Press and Lawyers’ Committee proposal do, we propose that the provision prohibit both data processing and transfer “that differentiates an individual or class of individuals.” This language adapts to changes in the nature of decision-making by shifting the focus from the decision to the output of an algorithm.
Further, we suggest that the COPRA provision apply to differentiation “with respect to any category or classification protected under the Constitution or law of the United States as they may be construed or amended from time to time” rather than enumerate specific protected classes. Besides mirroring our recommended duty of care language, this change avoids limiting protected categories to those mentioned in the statute and leaves resolution of these questions to ongoing legislative, judicial and political debate. Under a canon of statutory construction, courts treat statutory references to existing laws as the laws in effect at the time of enactment unless there is specific language to indicate otherwise. The phrase “as they may be construed or amended from time to time” supplies that language. Its effect would be to encompass sexual orientation and gender identity within a federal privacy law by virtue of the Supreme Court’s recent Title VII decisions.
Finally—and most significantly—we also recommend importing, in a modified form, a provision from a House Energy and Commerce Committee staff discussion draft that would provide a more thorough airing of the workings of contested algorithms: a disparate impact provision entitled “burden of proof” (Section 11(c)). It provides that:
If the processing of covered information … causes a disparate impact on the basis of any characteristics [protected under previous provisions], the covered entity shall have the burden of demonstrating that—
(A) such processing of data—
(i) is not intentionally discriminatory; and
(ii) is necessary to achieve one or more substantial, legitimate, nondiscriminatory interests; and
(B) there is no reasonable alternative policy or practice that could serve the interest described in clause (ii) of subparagraph (A) with a less discriminatory effect.
This provision is modeled on the Civil Rights Act of 1991, which, like other civil rights legislation discussed above, is a product of predigital times when issues more directly involved the intent of people and the organizations for which they acted. To focus on algorithms instead, we suggest replacing “causes a disparate impact on the basis of” with the language we suggested earlier: “differentiates an individual or class of individuals with respect to any category or classification protected under the Constitution or law of the United States.”
The rebuttal showing should be adapted to algorithms as well. If algorithms or artificial intelligence exercise significant control with limited human oversight, familiar methods of gauging intentionality fit poorly. In this context, an “intentionally discriminatory” standard is difficult to apply. Instead, to focus on data analytics, we propose requiring the covered entity to demonstrate that its data processing is “independent of any protected characteristic or classification.” Similarly, the terms “policy or practice” do not really fit algorithmic decision-making, even though algorithms may have policies or practices embedded in certain instructions; thus we suggest replacing this provision with “there is no reasonable alternative method of processing.”
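The modified burden-shifting test described above can be pictured as a simple decision procedure. The following Python sketch is illustrative only. It assumes each element of the test has already been resolved to a yes-or-no finding, and the class, field and function names are hypothetical; none of them appears in any bill or draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Findings:
    """Hypothetical yes/no findings; all names are illustrative assumptions."""
    # Prima facie showing: the processing differentiates an individual or
    # class with respect to a protected category or classification.
    differentiates_protected_class: bool
    # Rebuttal elements the covered entity would have to demonstrate.
    independent_of_protected_characteristic: bool
    necessary_for_legitimate_interest: bool
    no_reasonable_alternative_method: bool


def rebuttal_succeeds(f: Findings) -> bool:
    """The covered entity must establish all three rebuttal elements."""
    return (
        f.independent_of_protected_characteristic
        and f.necessary_for_legitimate_interest
        and f.no_reasonable_alternative_method
    )


def violation_established(f: Findings) -> bool:
    """Without a prima facie showing no burden shifts; otherwise the
    outcome turns on whether the rebuttal succeeds."""
    if not f.differentiates_protected_class:
        return False
    return not rebuttal_succeeds(f)
```

In this sketch, an entity that cannot explain its own system fails the rebuttal by default, which is the practical consequence of placing the burden on the party that operates the algorithm.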
When algorithms result in disparate impact, it is appropriate to shift the burden of proof to the party that employs or operates the algorithmic decision-making system. Most such burden-shifting mechanisms rest significantly on the superior knowledge and access to information of the employer or other party assigned the burden. This information disparity is overwhelmingly and uniformly the case for any algorithmic decision-making. Moreover, the prima facie showing based on “differentiating” is, in effect, a disparate impact showing and, without the rebuttal standard, the disparate impact prima facie test would become per se discrimination. A rebuttal standard is thus a necessary corollary of a shift in focus to algorithmic decision-making.
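To make concrete why a showing based on “differentiating” works like a disparate impact showing, consider one way it could be measured from an algorithm’s outputs alone. The sketch below is an assumption for illustration: neither the bills nor our report prescribes a metric, and the function names, the ratio of favorable-outcome rates and the `threshold` parameter are all hypothetical choices.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def favorable_rates(outcomes: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of favorable outcomes per group, from (group, favorable) pairs."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, is_favorable in outcomes:
        totals[group] += 1
        if is_favorable:
            favorable[group] += 1
    return {g: favorable[g] / totals[g] for g in totals}


def rate_ratio(rates: Dict[str, float]) -> float:
    """Lowest group rate divided by the highest (1.0 means equal rates)."""
    if not rates:
        raise ValueError("no outcomes to compare")
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome
    return min(rates.values()) / highest


def differentiates(outcomes: Iterable[Tuple[str, bool]], threshold: float) -> bool:
    """Flag differentiation when the ratio falls below a caller-chosen
    threshold. The threshold is an assumption, not a legal standard."""
    return rate_ratio(favorable_rates(outcomes)) < threshold
```

Nothing in this calculation looks at intent or at the inner workings of the system, which is why a rebuttal stage is needed before any finding of discrimination.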
Without this burden-shifting, a black box could provide impunity, encouraging willful ignorance on the part of covered entities that employ algorithmic decision-making. This would work against an array of recommended practices for ethical and responsible use of algorithmic decision-making and accountability. “The algorithm did it” should not be a sufficient defense in a discrimination case.
Other Algorithmic Accountability
In addition to the provisions discussed above that directly address discrimination resulting from data use, requiring accountability and transparency in uses of algorithmic decision-making can help prevent and identify discriminatory outcomes as well as address other concerns about the impact of algorithmic decision-making on individuals and society. COPRA and the USCDPA both include algorithmic impact assessment provisions.
In its civil rights section, COPRA calls for annual “algorithmic decision-making impact assessment[s]” if a covered entity engages or assists others in algorithmic decision-making and the algorithmic decision-making system is used to advertise housing, education, employment, credit or access to public accommodations. Such assessments must evaluate the design and data used to develop the algorithms; describe testing for accuracy, fairness, bias and discrimination; and assess whether the algorithms produce discriminatory results in areas protected by federal anti-discrimination laws. The FTC, in turn, is directed to publish a report on algorithm use and civil rights every three years after enactment.
The USCDPA has a parallel provision on “algorithm bias, detection, and mitigation,” which is contained within its Title II on “data transparency, integrity, and security.” The bill calls for the FTC to issue “algorithm transparency reports” that examine “the use of algorithms to process covered data in a manner that may violate Federal anti-discrimination laws” and to develop guidance on “avoiding discriminatory use of algorithms.”
Each bill also has a related section on privacy impact assessments. The USCDPA requires such assessments only for “large data holders”—which it defines as entities that process covered data from more than 5 million individuals or devices, or sensitive covered data from more than 100,000 individuals or devices. COPRA would require all covered entities—except for small businesses exempt under COPRA—to appoint privacy and security officers responsible for conducting annual risk assessments.
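The USCDPA’s “large data holder” definition, as summarized above, reduces to a two-part threshold. A minimal Python sketch follows, assuming an entity can count the individuals or devices whose data it processes; the constant and function names are hypothetical, and “more than” is read as a strict inequality.

```python
# Thresholds as summarized in the text above; the names are illustrative.
LARGE_HOLDER_COVERED_DATA = 5_000_000
LARGE_HOLDER_SENSITIVE_DATA = 100_000


def is_large_data_holder(covered_count: int, sensitive_count: int) -> bool:
    """True if either count of individuals or devices exceeds its threshold."""
    return (
        covered_count > LARGE_HOLDER_COVERED_DATA
        or sensitive_count > LARGE_HOLDER_SENSITIVE_DATA
    )
```

Under this reading, an entity exactly at either threshold would not qualify, and a comparatively small entity could still qualify through sensitive data alone.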
Impact assessments are vital tools to support civil rights in the context of personal data processing, but they should consider broader harms than those covered by federal anti-discrimination laws. In turn, a record of algorithmic risk assessments and FTC reports on discrimination or other effects of algorithms can inform future debate about forms of discrimination and the broader risks or benefits of algorithmic decision-making. We therefore concur with the USCDPA’s approach in putting algorithmic impact assessments in a separate section from civil rights provisions and in a title on obligations of the entities that collect and use personal information. We also concur with the USCDPA that algorithmic decision-making impact assessments should be mandatory only for large data holders.
We depart from both bills in proposing to broaden the scope of algorithmic decision-making impact assessments for large data holders to include both an initial risk assessment prior to deploying algorithmic decision-making and annual audits of the results after deployment to evaluate outcomes.
Both COPRA and the USCDPA require their respective algorithmic decision-making impact assessments and privacy risk assessments on an annual basis. Although annual assessments can provide an important ongoing factual record, the effect would be that most entities would conduct assessments after the fact, sweeping in whatever algorithmic decision-making they have deployed over the course of the year. Because advance thought on the impact of algorithmic decision-making is essential to avoid undesirable outcomes, we recommend requiring large data holders to conduct impact assessments when “considering” using a new algorithmic decision-making system, not just within one year.
We also suggest broadening the types of algorithmic decisions covered. COPRA lists only five specific categories: housing, education, employment, credit and public accommodations. All of these clearly can have a major impact on people’s lives. To protect individuals and enlarge understanding of the impact of algorithmic decision-making, we believe the trigger to conduct algorithmic decision-making impact assessments should be when covered entities consider using systems that more broadly “may have a significant effect on individuals”—similar to provisions on “automated individual decision-making” in the EU’s General Data Protection Regulation.
Targeting algorithmic decision-making impact assessments to large data holders would not free smaller entities from any responsibility for the effects of their algorithms because, as we conceive it, all covered entities would still be subject to the duty of care—which would mandate nondiscrimination—and to a baseline obligation to consider privacy risks and benefits to individuals.
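The assessment scheme we recommend can be summarized as a short rule. The Python sketch below is one possible reading, not statutory text: the enum, the function and its parameters are hypothetical, and it assumes the “significant effect” trigger applies to both the initial assessment and the later annual audits.

```python
from enum import Enum
from typing import List


class Assessment(Enum):
    INITIAL_RISK_ASSESSMENT = "initial risk assessment before deployment"
    ANNUAL_AUDIT = "annual audit of outcomes after deployment"


def required_assessments(
    large_data_holder: bool,
    may_significantly_affect_individuals: bool,
    deployed: bool,
) -> List[Assessment]:
    """Return the assessments owed for one algorithmic decision-making system."""
    # Only large data holders using systems that may significantly affect
    # individuals owe these assessments under this reading.
    if not (large_data_holder and may_significantly_affect_individuals):
        return []
    if deployed:
        return [Assessment.ANNUAL_AUDIT]
    # Still under consideration: assess before deploying.
    return [Assessment.INITIAL_RISK_ASSESSMENT]
```

An empty result covers only the assessment mandate. The duty of care and the baseline obligation to consider privacy risks would still apply to every covered entity.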
Privacy Is a Civil Rights Issue
Machines should not have a license to discriminate where humans cannot. Nondiscrimination provisions belong in privacy legislation because the use of personal information to discriminate implicates interests at the core of what privacy should protect—controlling how personal information is used and ensuring that this information is not used against the interests of individuals.