Commentary

How privacy legislation can help address AI

July 7, 2023


  • Artificial intelligence policy presents a challenge for lawmakers, but comprehensive privacy legislation provides a starting point.
  • The American Data Privacy and Protection Act (ADPPA) includes provisions on algorithmic accountability and fairness that can create a baseline for AI regulation.
  • The developers of AI systems, policymakers, and citizens will need to learn much more about the operation of the systems to address the biggest questions of AI policy.

The ChatGPT moment has captured attention like nothing before in the digital era. This is evident in the record-breaking uptake of ChatGPT as well as the wide public conversation, not only from almost every imaginable commentator but also around many dining tables. These days, when I mention I work at a think tank on artificial intelligence policy, the follow-up I get is, “What do you think about ChatGPT?”

The tsunami of attention is also evident in the rush by policymakers to demonstrate they are on top of this transformative threshold. On May 4, 2023, the White House called in leading CEOs for a meeting with Vice President Kamala Harris and senior staff and, alongside a drop-in visit from President Joe Biden, urged them to be responsible in AI development. The event was accompanied by an announcement of crowd-sourced testing of large language models (LLMs) at the next DEF CON conference and Office of Management & Budget guidance for federal agency adoption of AI, along with a recap of the White House Executive Order on bias, AI Bill of Rights, and other previous actions on AI. Two weeks later, the White House released an additional package of initiatives—this time focused on research and development, education, and gathering more public input.

On the other end of Pennsylvania Avenue too, Senate Majority Leader Chuck Schumer promptly announced that he is undertaking work on AI legislation. Soon after, he went to the Senate floor to report on assembling a bipartisan group to lead this work and, on June 21, he delivered a policy speech in which he outlined a “Safe Innovation Framework” for “comprehensive” AI legislation, with emphasis on innovation for American leadership on AI. This emphasis was balanced by a call to incorporate security (broadly conceived to include economic security), accountability, national values, and, perhaps most significantly, greater explainability for AI models. Schumer plans to flesh out his framework through a series of “insight forums” that will bring together a cross-section of experts to explore major AI issues outside (but alongside) the normal committee processes of the Senate.

The rapid pace of AI development in recent years presents many questions about human-computer interaction—from fundamental questions about how to control AI as machine learning grows more sophisticated, to a range of social policy issues stemming from the capabilities of generative AI, including how to adapt education and knowledge work (including the work of think tank researchers) and how to confront its impact on disinformation and democracy. To address some of these issues, the developers of AI systems, policymakers, and citizens will need to learn much more about the operation of the systems as well as their risks and benefits. As Senator Martin Heinrich (D-NM) put it, “this is going to be complicated stuff.”

Artificial intelligence policy is not starting from scratch, though. There is some consensus around measures that can help increase understanding and judgment about AI systems, their operation, and their effects. Algorithmic transparency, accountability, and fairness are the touchpoints of most AI governance frameworks and regulatory proposals. A 2019 survey in Nature of ethical AI guidelines and principles identified 84 such frameworks around the world. Of these, transparency in some dimension was reflected in 73; “justice & fairness” in 68; and “responsibility” in 60 (other salient points in common were beneficence and prevention of harm). So far, congressional proposals have followed similar lines. Disclosure of the uses and basis for algorithms is a significant element of Sen. Schumer’s framework. An early and influential bill in this regard has been the Algorithmic Accountability Act, originally introduced in 2019 by Sen. Ron Wyden (D-OR) and Rep. Yvette Clarke (D-NY), which focused on “algorithmic impact assessments” of AI systems defined as “high-risk” based on their potential impact on individuals.

Privacy legislation anticipates algorithm issues

Key privacy legislation has included provisions on algorithmic accountability and fairness, starting with the leading Senate bills from Sens. Maria Cantwell (D-WA) and Roger Wicker (R-MS) in the 116th Congress. The bipartisan American Data Privacy and Protection Act (ADPPA) in the 117th Congress built on these. The ADPPA was reported out of the House Energy & Commerce Committee in July 2022 by a 53-2 vote but did not get to the House floor during the end-of-year lame-duck session. It is likely to be re-introduced in the current Congress and can be a vehicle for Congress to get something done on AI issues.

In my own work, I have previously pointed to comprehensive privacy legislation to get at competition and content issues, so I could be accused of treating privacy legislation as the hammer for every tech issue that looks like a nail. I do not mean to suggest that privacy legislation can solve all the issues presented by emerging AI. But comprehensive privacy legislation is long overdue and, unlike other unfolding issues, the ADPPA offers a vehicle that has already advanced a long way through the legislative process over several years of work and debate. Indeed, in the current Congress the House Energy & Commerce Committee held hearings specifically on privacy legislation in March and April, the latest among many over three successive congresses. The collection, use, and sharing of personal information is fundamental to many technology applications of concern, and the ADPPA contains provisions that address other important aspects of AI governance.

I am not alone in suggesting privacy legislation as a remedy for issues other than privacy per se. In the first hearing of the year for the Subcommittee on Innovation, Data, and Commerce of House Energy & Commerce, on “strengthening America’s competitiveness and beating China,” full committee chair and ranking member Cathy McMorris Rodgers (R-WA) and Frank Pallone (D-NJ), subcommittee chair and ranking member Gus Bilirakis (R-FL) and Jan Schakowsky (D-IL), and Reps. Lisa Blunt Rochester (D-DE), Neal Dunn (R-FL), and Lori Trahan (D-MA), all called out the ADPPA as a necessary response, as did witnesses Samm Sacks and Brandon Pugh. When TikTok CEO Shou Chew testified before the committee in March, a long roster of members pointed to the volume of data collected by TikTok as part of a broader problem online: Rodgers, Pallone, Bilirakis, Blunt Rochester, and Trahan were joined by Reps. Richard Hudson (R-NC), Kat Cammack (R-FL), Kathy Castor (D-FL), Bill Johnson (R-OH), Yvette Clarke (D-NY), Jay Obernolte (R-CA), Greg Pence (R-IN), Anna Eshoo (D-CA), and Dan Crenshaw (R-TX). Privacy also came up during a May 2023 Senate hearing with OpenAI CEO Sam Altman, where Sens. Marsha Blackburn (R-TN), Richard Blumenthal (D-CT), and Jon Ossoff (D-GA) called for Congress to move forward with legislation on online privacy and data security—and so did Altman. These voices have been joined by commentators from Forbes, Stanford HAI, and the International Association of Privacy Professionals. And, according to Pew Research, the American people also are concerned about the impact of AI on their privacy.

The ADPPA contains several important provisions that would help address AI. First, the bill would place boundaries on the collection, use, and sharing of personal information, requiring that such processing be “necessary and proportionate” to providing a product or service or to other enumerated purposes. As I have previously written, such boundaries help to circumscribe the power of artificial intelligence to “magnif[y] the ability to use personal information in ways that can intrude on privacy interests by raising analysis of personal information to new levels of power and speed.” Indeed, data governance—the provenance, quality, and ethical use of data—is vital to AI governance frameworks. It is notable that, as foundation models have rolled out, it is privacy and data protection regulators in Europe and California that have stepped in to explore the sourcing and uses of personal data.

Second, the ADPPA contains a groundbreaking provision strengthening protections of civil rights laws in the digital sphere by prohibiting the use of personal information in ways that discriminate. As the White House AI Bill of Rights issued last year put it, the problems of bias in algorithms and automated systems “are well documented, and poor design of algorithms or unrepresentative training data can embed historical inequities in life-affecting contexts like hiring, housing, and health care as well as criminal justice among others.” The fact sheet accompanying the AI Bill of Rights called out the work of federal agencies to apply existing laws and mechanisms to address discrimination in housing, employment, and other opportunities. The ADPPA’s civil rights provision would reinforce these efforts.

Third, the ADPPA’s civil rights section also contains the algorithmic assessment and accountability provisions mentioned above. These would require algorithmic assessments that touch key elements of AI guidelines and frameworks. Nevertheless, the provisions can be improved to provide greater transparency and accountability for AI systems. The next section looks at how to incorporate these elements into the 2022 algorithm provisions of the ADPPA.

Adjusting privacy legislation for the ChatGPT moment

As reported out of the House Energy & Commerce Committee in July 2022, Section 207 of the ADPPA provided for two different tiers of algorithmic assessment. First, “large data holders” (entities with annual gross revenues of $250 million or more and data from five million or more individuals or devices) would be required to conduct more extensive “impact assessments.” These would have to be performed initially within two years of enactment and annually thereafter, detailing design, uses, training data, outputs, and enumerated risks. Second, all other covered entities would have to perform a more general “design evaluation” covering “design, structure, and the inputs” of algorithms and addressing the same risks enumerated for large data holder impact assessments. Both impact assessments and design evaluations would have to be filed with the Federal Trade Commission, with the option to make summaries of these available publicly. Given the increased salience of concerns about AI, these provisions should be updated to require covered entities to make information about algorithmic decision-making available to the public, to clarify which algorithms and risks are covered, and to address the need to measure outcomes.

Covered algorithms and entities

By the terms of the bill, the algorithmic impact assessment requirement for large data holders (Section 207 (c)(1)) applies to any algorithmic decision “that poses a consequential risk of harm to an individual or group of individuals and uses such algorithmic decision-making solely or in part to process covered data.” The more generally applicable “design evaluation” requirement (Section 207 (c)(2)) applies to any “consequential” algorithmic decision-making. The large data holder provision enumerates potential harms to avoid, and the design evaluation provision refers back to this enumeration. Even so, “consequential” by itself is too vague because it is not clear from the language what algorithmic decision-making is encompassed.

The ADPPA contains a definition of “covered algorithm.” It means:

A computational process that uses machine learning, natural language processing, artificial intelligence techniques, or other computational processing techniques of similar or greater complexity and that makes a decision or facilitates human decision-making with respect to [personal information], including to determine the provision of products and services or to rank, order, or promote, recommend, amplify, or similarly determine the delivery or display of information to an individual.

This language improves on versions that might have encompassed almost any computing used to assist human decisions, even something as routine as an Excel spreadsheet, but it is still overbroad. Recommendation engines are a concern when, for example, they produce invidious discrimination in product recommendations or lead children and teens down rabbit holes of harmful content, but not in every use. Rather than leave the meaning of “consequential” to the substantive provisions on algorithmic assessments, the definition of “covered algorithm” itself should elaborate the meaning of “consequential” to make clear that the requirements for algorithmic assessment and accountability are a function of the consequences of algorithmic decisions and not of any use of algorithms as such.

This definition should incorporate the specific categories of harm identified in Section 207 (c)(1)(B)(vi), which describe contexts and consequences that should be considered, such as access to housing, employment, and health care. However, additional umbrella language should capture the broader universe of harms to which these categories belong. For example, the European Union’s General Data Protection Regulation applies a right to human review of automated decisions that have “legal effects or similarly significant effects” on individuals. But “similarly significant” is also broad and vague; it could be clarified with language like “opportunities and economic, social, and personal well-being.” This would cover the enumerated categories while also encompassing other life-affecting applications.

In addition, the focus on any algorithmic tool that “facilitates” human decision-making is also overly broad, as that could sweep in decisions that involve substantial human agency in the loop. California Assembly Bill 331 (AB 331), as introduced in the 2023-24 session, contains better language: it would encompass an algorithmic process that is “a controlling factor” in a consequential decision.

There is an anomaly in the scope of the design evaluation provision. It covers a covered entity or service provider “that knowingly develops” a covered algorithm. By its terms, this focuses only on the developer of an algorithm and not on a covered entity that uses an algorithmic process developed by another entity. This ventures into questions about the allocation of responsibility across the software value chain, debated in connection with California AB 331 and the European Union’s proposed Artificial Intelligence Act, and appears to leave out uses that may be consequential. Rather than address such obligations across the value chain, information privacy legislation should apply the obligation to conduct a design evaluation to a covered entity that “employs” a covered algorithm. The ability to evaluate the design, structure, inputs, and outputs may be constrained under those circumstances, but some such inquiry is nonetheless a necessary step toward responsible AI systems.

Scope of assessments

Since the ADPPA was marked up last July, the National Institute of Standards & Technology (NIST) issued version 1.0 of its AI Risk Management Framework (AI RMF) in January 2023. The AI RMF provides a conceptual framework for identifying AI risks and a set of organizational processes for assessing and managing them. It is explicitly intended to be iterative and not one-size-fits-all. The requirements for risk assessments and design evaluations in the ADPPA should be reconsidered in light of the NIST AI RMF. For example, to avoid making the design evaluation provision one-size-fits-all, it could take into account the size and complexity of the entity involved and the volume and sensitivity of the data, paralleling language that recurs in a number of other ADPPA provisions (Sections 101(c) (data minimization guidance), 103(b) (privacy by design), 203(g) (regulations on individual control), and 208(a)(2) (data security)).

Above all, the elements of algorithmic assessments should more explicitly and distinctly incorporate measurement. In carpentry, the maxim is “measure twice, cut once.” For AI, the maxim needs to be measure, measure, measure, and measure again. As my Brookings colleague Nicol Turner Lee has written with Paul Resnick and Genie Barton, “It’s important for algorithm operators and developers to always be asking themselves: Will we leave some groups of people worse off as a result of the algorithm’s design or its unintended consequences?” (emphasis in original). This calls for audits or testing to ensure that AI systems are performing as expected, that steps to mitigate potential harms are working, and that adaptations from machine learning are not causing untoward results. Measurement is one of the core functions of the NIST RMF (along with governing, mapping, and managing), which recommends that “AI systems should be tested before their deployment and regularly while in operation” (emphasis added). The provision of the ADPPA on the scope of impact assessments (Section 207 (c)(1)(B)) refers to “outputs,” but it should make explicit that these assessments include measurement of outputs both before and after deployment.

In a similar vein, the language on design evaluation should tilt toward measurement as part of the evaluation. This provision (Section 207(c)(2)) should add “and expected outputs” to “design, structure, and inputs of the covered algorithm.”

The ADPPA should incorporate the NIST RMF into algorithmic assessments and design evaluations as a reference for guidance (though not a compliance safe harbor as such). This could be implied by a provision (Section 207(c)(4)) that provides for the issuance of Federal Trade Commission (FTC) guidance on Section 207 compliance “in consultation with the Secretary of Commerce,” who could delegate that role to NIST. An explicit reference would help to drive uptake of the NIST RMF and the development of standards and practices to inform compliance and enforcement. Because the NIST RMF will be updated iteratively and interactively on an ongoing basis, it is more dynamic than the usual agency rules or guidance.

Disclosure

The ADPPA provisions for assessments would require large data holders to document numerous features of the AI system being evaluated (Section 207 (c)(1)(B)) and to include this information in their reports to the FTC. Other covered entities would also have to submit their design evaluations. Making this information available to the public is entirely optional, however.

This leaves out an essential ingredient of AI transparency. As discussed above, transparency is a major element of frameworks for ethical and trustworthy AI, including the NIST Risk Management Framework. The ADPPA has a section on transparency (Section 202) that spells out information that must be disclosed about a covered entity’s collection and use of personal information and its privacy practices. These disclosures should also include information on algorithmic decision-making based on such data: the nature of such uses and decisions and the data and logic on which they are based.

While only a small portion of individuals is likely to view such information, it would be more accessible to them than looking up reports at the FTC. Moreover, making these public statements could bring to bear the FTC’s enforcement authority against unfair and deceptive practices when an entity’s conduct does not conform with its disclosures. There is a broad debate to be had about transparency across a wide range of technology applications, from platform content moderation to AI explainability, among others. Sen. Edward Markey (D-MA) has introduced the Algorithmic Justice and Online Platform Accountability Act to address several of these issues. But, at a minimum, privacy legislation that covers the collection, use, and sharing of personal information and requires disclosures about these practices should include disclosures on the use of personal information for algorithmic decisions that may affect the individuals who are the subjects of that information.

*    *    *

There is still much to learn in order to understand and address the challenges of AI systems. Without such understanding, some of the big questions raised by generative AI cannot be answered adequately. But effective measures to ensure algorithmic transparency and accountability are already part of privacy legislation; they can help inform this discussion, mitigate important risks, and provide a baseline for some of the applications of AI most likely to have an impact on individuals.
