Anthropomorphic AI terms create gaps in accountability

Science fiction has long imagined machines that think, feel, or act like people. That imagery shapes our public conversations and understanding of artificial intelligence (AI) technology.

Headlines frequently borrow science fiction tropes to describe AI systems as if they can “lie,” “scheme,” or “want” things. While that shorthand is vivid, it could imply a level of agency where none exists.

Companies often do the same, marketing chatbots as “assistants” and “agents” or equating a model’s “intelligence” to that of someone with an advanced degree.

Policymakers have adopted this language as well, and some of it has even made it into legislative text, where tools are said to “make” consequential decisions or perform tasks “normally requiring human intelligence.”

Research shows that people readily apply human traits to technical systems. In everyday conversation, that framing may be harmless. In policy, however, it can blur responsibility and reinforce the misconception that AI systems act independently of the people and institutions that design, build, configure, and deploy them. The result can be obscured accountability, weaker implementation, and stronger industry narratives that dilute oversight.

Why anthropomorphism is a governance problem

Humans reliably attribute human-like qualities to machines. The Eliza effect, named for the 1960s chatbot ELIZA, describes the tendency of users to attribute empathy to simple pattern-matching dialogue. Related research on the “computers are social actors” (CASA) paradigm found that people respond to digital systems using the same social cues they use with other people, even when they know they are interacting with software.

This framing can lead people to treat AI systems as independent decisionmakers, rather than tools built, refined, and deployed by people and institutions. In practice, large language models (LLMs) and related systems generate outputs from patterns in the training data and produce the next most likely output or action under the model’s training and system settings. These systems do not possess intent, physical consciousness, or understanding. People often call false outputs “hallucinations,” but that label groups together very different problems, such as missing source material, unclear prompts, manipulation of the training data, or the model generating text that sounds plausible without being verified. Developers often train models on enormous datasets that can include errors, parody, spam, stereotypes, and outdated information.
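To make the mechanics concrete, the following is a minimal sketch of next-output selection, not a description of any particular model. The candidate words and their scores are hypothetical stand-ins for what a trained model would compute, and the temperature parameter is one example of a system setting that a developer or deployer configures.

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; temperature is a configured setting."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the largest value for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def next_token(candidates, scores, temperature=1.0, rng=None):
    """Sample the next token from the probabilities derived from the scores."""
    rng = rng or random.Random()
    probs = softmax(scores, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical scores a model might assign after the prompt "The sky is"
candidates = ["blue", "falling", "green"]
scores = [4.0, 1.5, 0.5]

probs = softmax(scores)
most_likely = candidates[probs.index(max(probs))]
```

Nothing in this procedure checks whether the selected word is true. Lowering the temperature concentrates probability on the highest-scoring candidate, which is a configuration choice made by people rather than a preference of the system.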

Design choices also matter. Realistic voices, named personas, personality cues, and emotionally expressive chat interfaces can causally increase perceived human-likeness and trust, encouraging users to project human traits, including gender stereotypes, onto digital assistants. These features can make systems feel more helpful and trustworthy, but they can also encourage users to overshare, rely too heavily on the system, or form emotional attachments.

Researchers and journalists have documented cases in which prolonged interaction with companion-style chatbots appeared to intensify delusional thinking, emotional reliance, and severe mental health deterioration. Large-scale behavioral data suggests that some heavy users of personalized chatbots report higher loneliness and emotional dependency over time. When users perceive AI systems as intentional agents, users may overweigh system outputs relative to other sources of information.

The table below catalogs common anthropomorphic terms used in AI policy discourse, and, in some cases, legislative or regulatory text. Even when terms are not written in law, these framings routinely shape how policymakers describe systems, and those descriptions can end up shaping definitions, duties, and enforcement obligations.

Table 1

Policy consequences of anthropomorphic framing

First, human-like framing can dilute accountability by portraying an output as the system’s “judgment” rather than the result of human choices about data, objectives, thresholds, interfaces, and deployment. When a policy reads, “AI makes a decision,” the wording can blur the distinction between generating an output and deciding to act on that output. In practice, people select the training data, objectives, thresholds, and the context in which models are deployed and outputs are used. AI systems do not choose the objectives, design the optimization functions or weights, or determine the deployment contexts. Humans and institutions do. These choices remain central to liability and governance.
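The distinction between generating an output and deciding to use it can be sketched in a few lines. This is an illustrative example only: the agency name, feature names, weights, and threshold are all hypothetical, and the scoring function is a stand-in for a trained model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DeploymentConfig:
    deployer: str                # hypothetical entity that put the model into use
    threshold: float             # cutoff chosen by the deployer, not by the model
    human_review_required: bool  # whether low scores go to a person

def model_score(features):
    """Stand-in for a trained model: it returns a number and nothing more."""
    # Illustrative weights; in a real system these result from training choices.
    weights = {"late_payments": -0.25, "years_employed": 0.05}
    raw = 0.5 + sum(weights[name] * value for name, value in features.items())
    return max(0.0, min(1.0, raw))

def apply_policy(score, config):
    """The consequential step: an institution's rule turns a score into an action."""
    if score >= config.threshold:
        return "approve"
    return "refer to caseworker" if config.human_review_required else "deny"

config = DeploymentConfig(deployer="Example Agency", threshold=0.6,
                          human_review_required=True)
score = model_score({"late_payments": 1, "years_employed": 4})
outcome = apply_policy(score, config)
```

The same score produces a different outcome if the deployer sets a different threshold or removes human review. In this sketch, every element that determines the outcome traces back to a named, configurable choice.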

Second, anthropomorphic language creates risks for implementation. Describing a system as one that “sees” or “recognizes” does not tell a regulator what the system is supposed to do or how these functions could fail. A procurement specification that instructs a system to “see an apple” is less useful than one that defines the image conditions, performance threshold, and error tolerance. When statutory language relies on human-centric verbs, it increases the risk of misalignment between policy intent and technical execution. These descriptions can also weaken transparency by encouraging non-falsifiable claims about understanding instead of testable descriptions of model capabilities and limits.
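As a rough illustration of what a testable description might look like, the sketch below encodes a specification and checks labeled test results against it. The field names, numeric thresholds, and test data are assumptions made for this example and are not drawn from any actual procurement standard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClassifierSpec:
    task: str
    image_conditions: str
    min_accuracy: float             # performance threshold
    max_false_positive_rate: float  # error tolerance

def evaluate(spec, results):
    """Check (predicted, actual) boolean pairs from a labeled test set against the spec."""
    if not results:
        raise ValueError("at least one test result is required")
    correct = sum(1 for predicted, actual in results if predicted == actual)
    # Predictions made on images that do not actually contain the target object
    negatives = [predicted for predicted, actual in results if not actual]
    false_positives = sum(1 for predicted in negatives if predicted)
    accuracy = correct / len(results)
    fpr = false_positives / len(negatives) if negatives else 0.0
    return {
        "accuracy": accuracy,
        "false_positive_rate": fpr,
        "meets_spec": accuracy >= spec.min_accuracy
        and fpr <= spec.max_false_positive_rate,
    }

spec = ClassifierSpec(
    task="flag images that contain an apple",
    image_conditions="indoor lighting, 640x480 or larger, object unobstructed",
    min_accuracy=0.95,
    max_false_positive_rate=0.05,
)

# Hypothetical test results: 20 labeled images, two of them misclassified
results = ([(True, True)] * 9 + [(False, True)]
           + [(False, False)] * 9 + [(True, False)])
report = evaluate(spec, results)
```

With these made-up results, the system reaches 90 percent accuracy and a 10 percent false positive rate, so it fails the specification. A claim that the system “sees” apples cannot be checked in this way.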

Third, anthropomorphism can reinforce industry narratives. Technology firms have strong incentives to market products as intelligent, conversational, or agentic to drive adoption. Terms such as artificial intelligence and machine learning can function as placeholders for more precise descriptions of computational processes, enabling companies to market ordinary data-processing techniques as autonomous intelligence. This framing can also shift perceived agency from corporations to “autonomous” systems and dilute scrutiny of business practices, dataset sourcing, and profit incentives. If a system “decided,” responsibility appears more ambiguous than if a company configured, tested, and deployed a model with known limitations. The framing also fuels public misconceptions that AI systems have measurable intelligence comparable to humans, or that these systems are beyond human control.

Finally, anthropomorphic language may cut against established governance principles. The OECD AI Principles emphasize human-centered values, transparency, and accountability. Similarly, in the U.S., the National Institute of Standards and Technology (NIST) AI Risk Management Framework’s Generative AI Profile stresses the accurate characterization of system capabilities, limitations, and failure modes. When companies opt for anthropomorphic language or design choices, they may undermine these widely held principles of system transparency.

Some of these terms have legitimate technical origins. In early AI research, an agent was framed as a system selecting actions in an environment to optimize a performance measure; machine learning described parameter optimization from data; and memory referred to stored state (e.g., cached context or external retrieval stores). Nonetheless, when policymakers or marketers repurpose these technical terms as human-like descriptors, they can imply human-like cognition or open-ended autonomy beyond what developers configure.

Table 2

What would technically precise policy vocabulary look like?

Statutes should prefer operational verbs. Systems detect, classify, cluster, score, rank, and optimize. “Decide” should be reserved for the human or institutional act that produces legal or material effects.
Statutes should attribute agency explicitly to developers, deployers, and operators. Legal text can specify which entity determines a system’s purpose, sets decision thresholds, and decides whether and how outputs influence consequential decisions. It can also establish avenues for civil society oversight. Many state laws already name who must test, disclose, and monitor systems.
Policy should specify transparency in concrete non-anthropomorphic terms about data provenance, model architecture, and known limitations. Where full technical explanation is infeasible due to opacity or trade secrets, statutes can mandate documentation of training data categories, evaluation benchmarks, and performance disparities.
Lawmakers should name actors across the supply chain, rather than abstracting them into references to AI, to clarify oversight responsibilities. Instead of stating that “AI verifies identity,” statutes and agency guidance can specify that a particular contractor deploys a face-matching algorithm trained on specified datasets by a particular developer under defined thresholds.
Policymakers should scrutinize whether human-like design is necessary for a given use case. Emerging laws in California and New York address emotional dependence and over-trust by requiring disclosures when users interact with simulated personas and treat chatbots as products whose design choices and deployment contexts are subject to governance. In higher-risk domains such as mental health, employment, education, or public benefits, reducing anthropomorphic cues may mitigate overreliance and manipulation.

Precision in legislative language will not solve every governance problem, especially given information asymmetries and power imbalances, but it can improve institutional outcomes at the margin. In the AI context, anthropomorphic terminology can make accountability and implementation unclear. More operational language can clarify who builds and deploys systems, how they function, where risks occur, and which actors bear documentation, testing, monitoring, and liability obligations. It strengthens procurement, enforcement, and public understanding. AI systems may be multipurpose technologies, but they are still engineered products embedded in organizational decision chains. Policy should describe the technology accordingly.