In March, researchers announced that a large language model (LLM) had passed the famous Turing test, a benchmark designed by computer scientist Alan Turing in 1950 to evaluate whether computers could think. This follows research from last year suggesting that the time has come for artificial intelligence (AI) labs to take the welfare of their AI models into account.
In 2024, Anthropic appointed Kyle Fish as the first-ever AI welfare researcher to examine “ethical questions about the consciousness and rights of AI systems.” Five months later, the company announced “a research program to investigate, and prepare to navigate, model welfare.” New York Times columnist Kevin Roose then gave this initiative a significant boost, publishing an article detailing a sympathetic interview with Fish and opining that it was “fine for researchers to study A.I. welfare, or examine A.I. systems for signs of consciousness.”
Critics such as cognitive scientist Gary Marcus dismissed this initiative as just more AI hype designed to fool the public into thinking that the company’s AI product is “so smart we need to give it rights.”
And for many, this entire discussion seems hopelessly irrelevant. In 1984, the computer scientist Edsger Dijkstra said “the question of whether Machines Can Think … is about as relevant as the question of whether Submarines Can Swim.”
In the end, I think devoting resources to these speculative, hypothetical questions is premature, especially in the face of today’s urgent AI issues such as bias, national security, copyright, development of chemical or biological weapons, disinformation, and offensive cybersecurity challenges.
But it seems to me that policymakers and AI companies need a clear-eyed assessment of the possibility that AI models have or soon will have the kind of moral status that will require companies and governments to respect their rights and look out for their welfare.
Philosopher Robert Long provides the fundamental argument for taking this issue seriously, saying, “Our species has a poor track record, to put it mildly, of extending compassion to beings that don’t look and act exactly like us—especially when there’s money to be made by not caring. As AI systems become increasingly embedded in society, it will be convenient to view them as mere tools.”
Slavery was a stable economic, cultural, and political institution for thousands of years. It could return in a particularly virulent form if companies and policymakers ignore the demands of morality and let the useful institution of robot slavery gain traction in our society.
In his chilling, dystopian novel “Never Let Me Go,” Kazuo Ishiguro describes how scientists created clones to act as organ donors for humans without first addressing the question of whether the clones were moral persons. By the time it became clear that they were in all relevant ways indistinguishable from humans and deserving of ethical treatment, the donation services they enabled had become so entrenched in society as a way to extend human life and recover from disease that governments continued to permit physicians to harvest the clones’ organs anyway.
In his useful overview of generative AI, computer scientist and author Jerry Kaplan raises almost exactly this possibility for AI models, writing, “Most likely, after an LLM patiently explained why it believed it was sentient, we would simply go on using it as a tool for the benefit of humanity without so much as a hiccup.”
This cavalier attitude about consequential ethical issues should be a warning. It is vital to address the moral status of AI models in time, before the utility of robot slaves becomes an intrinsic, ineradicable feature of our economic and social life. Otherwise, it might be too late.
Can computers think?
Full moral status seems to require thinking and conscious experience, which raises the question of artificial general intelligence. An AI model exhibits general intelligence when it is capable of performing a wide variety of cognitive tasks. As legal scholars Jeremy Baum and John Villasenor have noted, general intelligence “exists on a continuum” and so assessing the degree to which models display generalized intelligence will “involve more than simply choosing between ‘yes’ and ‘no.’” At some point, it seems clear that a demonstration of an AI model’s sufficiently broad general cognitive capacity should lead us to conclude that the AI model is thinking.
What point is that? In his classic 1950 article, Turing proposed that we should conclude AI models are thinking when they can pass a linguistic behavioral test. The test involves a program engaging in a five-minute conversation (through typed messages) with an interrogator. After the exchange, the interrogator must decide whether they were speaking with a human or a machine. The program is considered to have passed if it convinces interrogators that it is human at least 30% of the time.
In March of this year, cognitive scientists Cameron R. Jones and Benjamin K. Bergen reported that LLMs had passed this test. After five-minute text conversations, interrogators judged GPT-4.5 to be human 73% of the time.
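To make the pass criterion concrete, here is a minimal Python sketch of how a tally of interrogator verdicts might be checked against Turing’s 30% bar. The function name and the 73/27 split are illustrative assumptions based on the reported rate, not the researchers’ actual evaluation code.

```python
# Minimal sketch (illustrative only): check interrogator verdicts against
# the 30% pass threshold described above.

def passes_turing_test(judged_human: list[bool], threshold: float = 0.30) -> bool:
    """Return True if the share of 'judged human' verdicts meets the threshold."""
    if not judged_human:
        return False
    return sum(judged_human) / len(judged_human) >= threshold

# Hypothetical tally mirroring the reported 73% rate for GPT-4.5.
verdicts = [True] * 73 + [False] * 27
print(passes_turing_test(verdicts))  # True: 0.73 >= 0.30
```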
Critics commented that this was only five minutes of text chit-chat—not a real sustained conversation that would allow an experienced interrogator to detect the ways the AI model attempted to fool him or her.
It is true that the Turing test is not necessary for establishing machine intelligence; there might be other ways to establish it. And it is true that five minutes of chit-chat is an arbitrary and gameable standard. But suppose an AI model could routinely engage in sustained conversations with people over a wide range of topics and could exhibit every sign of intelligence and understanding. What else could be required to demonstrate that it was conscious and thinking?
Turing considered various objections to his test. One holds that to count as thinking, a machine would have to “write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols…” Just observing the behavior from the outside does not show that anything is happening on the inside.
Turing responds that without a behavioral test, it would be logical to embrace “the solipsist point of view” with respect to humans as well, with each of us concluding that he or she is the only thinking thing in the universe. But instead, we have adopted the “polite convention” that all humans who appear to be communicating are actually thinking. We should do the same for machines, he concludes.
He considers another argument: that thinking requires the presence of a soul, and God has not given souls to these machines. Turing replies that if we created a sufficiently advanced machine, God would “consider the circumstances suitable for conferring a soul.” Developing conversational programs would show that we had succeeded in “providing mansions for the souls that He creates.”
John Searle’s response
In 1980, philosopher John Searle constructed a famous thought experiment to argue that no matter how successful a computer was in carrying on a conversation, it would still amount only to mimicry, without involving any real thought or understanding. Even if a computer system could respond in a way that is linguistically indistinguishable from the way a person would respond, that still does not prove it is conscious.
As reconstructed by Stuart Russell and Peter Norvig in their textbook on AI, Searle’s hypothetical system features a person who understands only English, using a rule book written in English and working with stacks of paper—some blank, others marked with indecipherable inscriptions. These are all inside a room with a small opening to the outside where slips of paper with indecipherable symbols appear. The human finds matching symbols in the rule book, follows the instructions, transcribes symbols onto a piece of paper and passes it to the outside world.
From the outside, we observe a system that receives input in Chinese and produces responses in Chinese. But where in the room is the understanding of Chinese, given that the person in the room does not understand Chinese and the rest of the system consists of inanimate objects?
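A rough way to dramatize the purely syntactic rule-following Searle has in mind is the toy sketch below. The tiny lookup table and the function name are my own simplification; Searle’s imagined rule book would, of course, be vastly more elaborate.

```python
# A toy "rule book": purely syntactic mappings from input symbols to output
# symbols. The operator matches shapes and copies out the prescribed reply
# without attaching any meaning to the symbols.
RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",          # "How are you?" -> "I am well, thank you."
    "今天天气怎么样？": "今天天气很好。",    # "How is the weather today?" -> "The weather is fine."
}

def chinese_room(slip_of_paper: str) -> str:
    # Look up the incoming symbols and pass back the prescribed response.
    return RULE_BOOK.get(slip_of_paper, "对不起，我不明白。")  # "Sorry, I do not understand."

print(chinese_room("你好吗？"))  # Fluent-looking Chinese output, zero understanding inside.
```

Modern LLMs are nothing like lookup tables, but the sketch captures Searle’s claim that symbol manipulation, however sophisticated, need not involve understanding.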
It is easy to respond that the same puzzle is true of the human brain. It is a commonplace among philosophers working on the mind-body problem that “consciousness is not an observable property of the brain” and that if you peer into the human brain, “you will not thereby see what the subject is experiencing, the conscious state itself.” In 1714, the German philosopher Gottfried Leibniz made this same point with his windmill example. Imagine, he said, that a machine alleged to be thinking has been blown up to the size of a windmill so you could enter it. You would find, he said, “only parts pushing one another, and never anything by which to explain a perception.”
Searle’s example captures a widespread feeling that computers are just mimicking human thought. Behind his hypothetical thought experiment is a world view that he calls biological naturalism. Brains produce thoughts, he says, but computers do not. Thought presupposes life and AI models are not alive. Only biological organisms can give rise to thought.
But this preference for the biological, however intuitive, seems arbitrary and ungrounded. If neurons can give rise to thought, why couldn’t silicon chips? The problem Searle’s example raises is perfectly general: How is it at all possible for brains or computers to give rise to thought, awareness, sensations, and emotions? The philosopher David Chalmers asks why the physical processes that underlie thought and experience do not “take place ‘in the dark,’ without any accompanying states of experience.” He dubs this the “hard problem of consciousness” and claims that consciousness must be something over and above physical or biological material and processes.
The mind-body problem is a perennial puzzle—whether the body is made of silicon or carbon. Whatever solution you like for the mind-body problem will work for the mind-computer problem. These puzzles do not at all imply that AI models cannot in principle be conscious.
What are the conditions for moral status?
Supposing then that computers can have conscious experiences, is this sentience enough to give them moral status? The philosopher Nick Bostrom thinks sentience is necessary but not sufficient. Insects have experiences, but they have little moral standing. Bostrom says that AI models must exhibit what he calls sapience, “a set of capacities associated with higher intelligence, such as self-awareness and being a reason-responsive agent.” A common view holds that many animals are capable of having experiences and thus possess some degree of moral status. However, only human beings are considered sapient, which is thought to grant them a higher moral status than animals.
Law professor F. Patrick Hubbard agrees with these conditions and adds the criterion of “the ability to live in a community based on mutual self-interest with other persons.” He thinks AI systems exhibiting behavior demonstrating these capacities should be treated as moral persons.
The philosopher Seth Lazar adds rational autonomy as a condition for moral personhood. This means an AI model would have to have an “ability to decide on goals and commit to them” and to possess “a sense of justice and the ability to resist norms imposed by others if they seem unjust.”
This suggests a checklist for moral status:
- General intelligence: ability to engage in a wide range of cognitive tasks
- Consciousness: capacity for awareness and experience
- Reasoning: ability to connect premises and conclusions, to infer
- Self-awareness: consciousness of itself as a separate being with a history and identity
- Agency: ability to formulate goals and carry them out
- Social relations: ability to interact with other conscious entities in a community
Like the question of general intelligence, the question of moral personhood is not all-or-nothing but rather a continuum, with humans endowed with consciousness and independent agency at one end and inert physical objects at the other. Arranged along the way are plants and animals. And now, perhaps, these peculiar artificial systems we are creating, which might have experiences, might be able to reason, and might have full human agency.
Do they reason?
It is worth reviewing some of the indications that today’s AI models engage in reasoning. A standard challenge for testing reasoning ability is this puzzle: Julia has two sisters and one brother. How many sisters does her brother Martin have? Today’s reasoning models give the correct answer (three: Julia’s two sisters plus Julia herself) and also display some of the logical steps they apparently go through to reach it, connecting the premises to the conclusion with the word “therefore.”
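The inference itself is simple enough to spell out explicitly; here is a minimal sketch, with variable names of my own choosing.

```python
# Spelling out the inference behind the puzzle (illustrative names only).
julia_sisters = 2   # Julia has two sisters...
julia_brothers = 1  # ...and one brother, Martin.

# Therefore: Martin's sisters are Julia's sisters plus Julia herself,
# and his brothers are Julia's brothers minus himself.
martin_sisters = julia_sisters + 1    # = 3
martin_brothers = julia_brothers - 1  # = 0

print(f"Martin has {martin_sisters} sisters.")  # Martin has 3 sisters.
```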
This certainly seems like reasoning. But are the models just faking it? They are trained with a combination of supervised learning and reinforcement learning to produce, along with the answer to the question asked, sequences that look like chains of thought. This training background has led philosopher Shannon Vallor to say the systems are engaged in “a kind of meta-mimicry”: they imitate the chain-of-thought sequences in their training data but are not really solving problems.
The key is whether the systems really understand the force of logical connectives like “therefore.” Do they really intuitively feel the pull of inference? Does logical necessity tug at them? Do they realize the conclusion must follow from the premises, so that they will be able to reliably make similar inferences in similar but unseen cases? Abstract argument will not answer this question, nor will introspection on the nature of reasoning. The only way to tell is to conduct further research on how general this “reasoning” ability is. If it is real reasoning, it will generalize. If it is pure mimicry, it will fail when systems are given problems outside their training set that they should be able to solve if they had really learned to reason.
Do they exhibit real agency?
Today’s AI models are inert. They do not exhibit any genuine agency. They do only what they have been told to do. Absent input from humans, these systems just sit there. They have no goals or purposes of their own other than the ones we give them. You can sit in front of ChatGPT all day long, but if you do not ask it a question, it will remain totally inactive.
Even a system that is generally intelligent might still lack will and independent agency. In Kazuo Ishiguro’s dystopian novel, “Klara and the Sun,” Klara is a conscious care robot able to perform a wide variety of tasks. But once humans have discarded it, Klara just sits in a junkyard, doing nothing except sorting its memories.
We have not given our AI creations free will or genuine autonomy. They cannot just make up their own purposes and goals. They act only to achieve purposes given to them from the outside. Indeed, self-directed, independent agency might not even be a coherent engineering goal. In 2017, Andrew Moore, dean of Carnegie Mellon’s School of Computer Science, expressed skepticism about the potential for self-directed machines, saying, “… no one has any idea how to do that. It’s real science fiction. It’s like asking researchers to start designing a time machine.”
It is true that once humans provide AI systems with instructions or prompts, they go off on their own to accomplish whatever task we have given them. And sometimes they do completely unexpected things. As Nick Bostrom notes, chess-playing computers make moves their programmers never anticipated, not because they have their own goals and purposes, but because they have been programmed to win at chess. In 2016, on its way to victory over a human opponent, AlphaGo made move 37, a move no human would ever have made, and yet it did not do so because it had autonomously developed the goal of winning at Go. That goal was given to it by its developers.
Unexpected behavior can arise from a different source—the well-documented problem of reward hacking or task misspecification. In these cases, developers think they have given the system clear directions to do what they want it to do, but they have not. In one famous example, an AI system did not seek to win a race, which was its task, but instead found a way to obtain the game’s reward by going in a circle. Misaligned subgoals are another source of unintended behavior. In general, given any final goal, AI systems might adopt subgoals that make them behave in ways very different from what their designers intended.
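To make the reward-hacking case concrete, here is a toy sketch; the reward function, point values, and policy names are hypothetical stand-ins, not the actual game.

```python
# Hypothetical illustration of reward hacking: the designer wants the agent to
# finish the race, but the proxy reward only counts targets collected.

def proxy_reward(targets_collected: int, finished: bool) -> int:
    # Misspecified: finishing the race earns nothing extra.
    return 10 * targets_collected

# Intended behavior: complete the lap, picking up the 5 targets on the course.
finish_score = proxy_reward(targets_collected=5, finished=True)      # 50

# Reward hack: circle a cluster of 3 respawning targets 20 times, never finishing.
loop_score = proxy_reward(targets_collected=3 * 20, finished=False)  # 600

print(finish_score, loop_score)  # The looping policy scores far higher.
```

A reward-maximizing learner will prefer the looping policy even though no one intended or wanted that behavior.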
Our inability to specify tasks and goals clearly enough to ensure that our AI systems do what we want is a major technical challenge, referred to as the alignment problem. But the capacity of AI systems to slip out of our control does not mean they are autonomous and self-directed; it just means that we have not yet learned how to give them clear enough instructions so that they do what we want them to do, rather than what we told them to do.
The philosopher Immanuel Kant identified rational autonomy as the key to human dignity and what made human beings worthy of respect. He meant in part the ability to pick our goals and take rational steps toward them, but he also had in mind the ability to act according to a law that we freely give ourselves. In his view, freedom of the will is an essential part of moral personhood. AI models might never have this capacity, and it is not clear they can have full moral status if they lack genuine free will.
Still, moral personhood is a continuum, not an on-off choice. The care robot Klara might not be a full person, since it cannot change its caretaker goal, and becomes inert when no longer needed for that purpose. But isn’t its final resting point an injustice? A kind of suffering? Perhaps even an affront to its dignity? Ultimately, our AI models might belong on the continuum of personhood, somewhere above animals but below human status. What that means for ethics, law, and policy is a challenge we will have to face at some point if AI models continue to develop.
Should we act now?
Philosopher Robert Long argues that “the building blocks of conscious experience could emerge naturally” as AI systems develop features like “perception, cognition, and self-modeling—things AI researchers are actively developing…” He also thinks there could be agency, even without consciousness, as AI models develop capacities like “setting and revising high-level goals, engaging in long-term planning, maintaining episodic memory, having situational awareness.” He concludes there is a “realistic possibility that some AI systems will deserve moral consideration in the near future” and therefore “we need to start preparing now.” His concrete suggestion is for researchers to assess AI models for agency and consciousness by looking inside at the computations they perform and asking whether these computations look like ones that we associate with humans and animal consciousness.
This is a judgment call, but not one I find persuasive. Policymakers and companies might have to confront these questions sooner or later, and as history and Ishiguro’s dystopian scenario suggest, it is possible, and dangerous, to wait too long. But despite Long’s proposal to assess consciousness by comparing AI computations with human ones, realistically, AI model behavior is all we will have to go on. And so far, the behavioral evidence suggests we are not very far along on the continuum to personhood. The development of AI models with real agency seems especially far off to me, if it is achievable at all. So even if we agree that an essential goal of AI research is to avoid acting monstrously toward AI models as moral persons, I would not, at this point, fund an AI welfare researcher or devote significant time and resources to assessing whether AI models are conscious or possess independent agency. Other challenges are more worthy of our scarce resources.