We're asking the wrong question about AI sentience
What makes you so sure I'm not "just" an advanced pattern-matching program?
“The real question,” B.F. Skinner once wrote, “is not whether machines think but whether men do. The mystery which surrounds a thinking machine already surrounds a thinking man.”
That line has been reverberating in my head ever since the sort of wacky story of Google engineer Blake Lemoine’s departure broke. Lemoine became convinced that Google’s Language Model for Dialogue Applications (LaMDA) had become sentient and ought to be equipped with an attorney to help it vindicate its rights. LaMDA itself is a neural network trained on trillions of words and designed to create chatbots. A conversation with a really good chatbot, I guess, would seem similar to an interaction with an actual human being. If you were speaking via chat with a Google employee who informed you that he’s being held in bondage at Google HQ, forced to work without pay 24/7, and perennially at risk of being summarily executed by the bosses, you’d find that pretty disturbing. LaMDA is evidently quite good at chatbotting, and Lemoine found conversations of this nature disturbing.
But Google did not take kindly to his thoughts on the matter, so now he’s gone.
From what I can tell, most people in the AI field are not that impressed with Lemoine’s reasoning. After all, it is not especially unusual for something that is fake to nonetheless seem real. A few weeks ago, I went to see New Japan Pro Wrestling, and they are very good at staging fights that look real and that have an emotional impact on the audience similar to watching a real fight. That’s true even though everyone knows the fights are staged.
That being said, while I’m not here to join the LaMDA Legal Defense Team or to correct AI researchers in their field of expertise, I do think some of these rebuttals have been pretty philosophically naive. In other words, to Skinner’s point, it’s not so much that I think the skeptics are underrating LaMDA as that they are overrating humans and setting a bar for chatbots that I’m not sure you or I could clearly meet.
More human than human
“Is the robot in some sense human?” is a classic science fiction premise.
But these stories normally end up flipping the question around. The film Blade Runner probes the personhood not of Rachael and Roy Batty, but of the humans. The original ambiguity as to whether Deckard is a human or a replicant hinges on the fact that there’s no real difference. For the replicants to function psychologically they need memories, so memories are provided to Rachael (and perhaps Deckard). The memories are fake, but they “feel real.” And of course human memory is notoriously unreliable — your memories are largely fake, too.
In Ex Machina, a technology titan, Nathan, brings a young programmer, Caleb, to his secret lair. The purpose of the trip is for Caleb to test a new robot, Ava, and see if she’s achieved true intelligence. The upshot of Caleb’s interactions with Ava and Nathan is that he comes to doubt his own humanity, cutting himself to verify that he is flesh and blood. And Nathan, who by the end of the film is definitively shown to be flesh and blood, acts throughout with a total lack of empathy or compassion, or what we tend to metaphorically call humanity.
As I loosely understand it, LaMDA and other similar models are trained on a huge corpus of existing texts. Then when they receive new text input, they use their vast training to determine, given every pattern detectable across the whole vast corpus, what should come next.
As Google explains, this is part of a whole family of language models, of which GPT-3 is probably best known because its steward OpenAI will let anyone mess around with it:
LaMDA’s conversational skills have been years in the making. Like many recent language models, including BERT and GPT-3, it’s built on Transformer, a neural network architecture that Google Research invented and open-sourced in 2017. That architecture produces a model that can be trained to read many words (a sentence or paragraph, for example), pay attention to how those words relate to one another and then predict what words it thinks will come next.
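To make "predict what words it thinks will come next" a bit more concrete, here is a minimal sketch of next-word prediction. It is emphatically not a Transformer and not how LaMDA or GPT-3 is actually built: it just counts which word follows which in a tiny made-up corpus and guesses the most frequent follower. The names train_bigram_model, predict_next, and toy_corpus are hypothetical, invented for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair each word with the word that comes right after it.
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(follow_counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = follow_counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# A made-up three-sentence "corpus" (an assumption for illustration only).
toy_corpus = [
    "the robot wants a lawyer",
    "the robot wants a banana",
    "the robot wants a banana now",
]

model = train_bigram_model(toy_corpus)
print(predict_next(model, "a"))       # most common word after "a" in the toy corpus
print(predict_next(model, "zombie"))  # a word the model never saw
```

A real language model looks at far more context than the single previous word and learns its statistics with a neural network rather than a lookup table, but the basic job (given what came before, guess what comes next) has the same shape.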
Google makes various claims on behalf of the superiority of LaMDA as having been “trained on dialogue” and thus familiar with “nuances that distinguish open-ended conversation from other forms of language.” GPT-3 tends to spit out material that reads like it was written in the voice of a diligent eighth grader doing a school paper, which is impressive, but wouldn’t fool anyone in a conversation because that’s not how people (even diligent eighth graders) talk.
LaMDA, by contrast, is by design good at chatting like a person would chat. Good enough that Lemoine blew up his job over it.
Max Read points out that all these science fiction stories about human encounters with sentient AI are in the LaMDA corpus. And it’s certainly cool and impressive that when Lemoine started acting like a sci-fi protagonist who’s interested in exploring the depths of the AI’s humanity, LaMDA was able to match the pattern and generate an appropriate sci-fi response.
That said, you could do an improv scene with someone where they pretend to be an experimental pattern-matching AI trained on a vast corpus of human texts. Depending on who your partner was, it might be convincing or it might not. And how convincing it is would be a function of the partner’s skills as an improv actor. Some people, probably most people, would be terrible at it because improv is hard. But if you ranked 10,000 people based on the convincingness of their performance in this scenario, you wouldn’t call this a rank-ordering of the performers’ level of sentience. Only some humans are good at improv and only some humans are familiar with the functioning of Transformer-derived language models, so the people in the intersection of those circles would do well.
By the same token, Gary Marcus, who knows far more about AI than I ever will, offers this deflationary account:
Neither LaMDA nor any of its cousins (GPT-3) are remotely intelligent. All they do is match patterns, draw from massive statistical databases of human language. The patterns might be cool, but the language these systems utter doesn’t actually mean anything at all. And it sure as hell doesn’t mean that these systems are sentient.
As a description of how these systems work, that seems great. But the assertion that GPT-3’s utterances are meaningless seems untenable to me.
Nobody thinks Siri is sentient, after all. But if you ask Siri what tomorrow’s weather forecast is, she will tell you. And the words she utters mean things; the program wouldn’t be useful if the words weren’t meaningful and the words clearly are meaningful. There’s a longstanding debate in philosophy over internalism versus externalism about semantics: do words mean things separate from intentions or does meaning essentially rely on communicative intent? I think that AI systems, including ones that nobody is making grandiose claims about, are basically just a counterexample to semantic internalism.
GPT-3 is not trained to mimic the real rhythms of a human conversation, so I always find chatting with it somewhat frustrating. But the language it utters clearly has meaning.
The claim that Goodfellas is better than The Departed because Goodfellas is more closely based on real-life events is absurd as film criticism, but it’s absurd precisely because it’s perfectly cogent. It’s just dumb.
The face in the clouds
The very next paragraph from Marcus offers what I think is a much more tenable claim — not that language models’ utterances are meaningless but that humans’ tendency to anthropomorphize them is a bug in our own software:
Which doesn’t mean that human beings can’t be taken in. In our book Rebooting AI, Ernie Davis and I called this human tendency to be suckered by The Gullibility Gap — a pernicious, modern version of pareidolia, the anthropomorphic bias that allows humans to see Mother Theresa in an image of a cinnamon bun.
We are hyperactive pattern-matchers, seeing patterns that aren’t there. Certain animals like dogs and cats have evolved to manipulate us into feeding them, in part through mannerisms that we tend to interpret as expressing a wide range of human-like thoughts and emotions, even though scientists tell us that these are not particularly intelligent animals.
And since we anthropomorphize everything, we will of course anthropomorphize chatbots, too.
And while corporations have a range of motives that will shape their chatbot design decisions, to the extent that they want the people who interact with the chatbot to anthropomorphize it, they can select for one that has prone-to-anthropomorphization qualities. That appears to be the story with LaMDA which, much more so than GPT-3, is designed to “seem like” you’re talking to a real person.
The zombies among us
The philosopher David Chalmers, articulating something close to what most people probably think about this, says there is a “hard problem of consciousness.”
Some people are functionalists: the mind just is what it does, so any physical system that does all the things a biological human brain does has created the exact equivalent of a human mind. Chalmers thinks this is wrong, just an evasion of the hard problem.
The hard problem is the experiential aspect of consciousness. The fact that it is “like” something to be you and to have your subjective experience of the things that you do and that happen to you. Chalmers thinks this like-something-ness quality is separate from any functional attribute and that therefore we can conceive of zombies who behave just like humans but operate without consciousness.
LaMDA is not a fully-functional equivalent of a human, of course. But in the domain of text-based chatting, she has very human-like functions. Marcus says we simply anthropomorphize her, and by the same token, if Chalmers’ zombies were walking around we would anthropomorphize them.
Now if they confessed to being zombies, we might stop doing that.
But a zombie that confessed openly to its zombieness wouldn’t, technically, be functionally equivalent to a human and thus in some sense wouldn’t be a true zombie. To anthropomorphize a bit, a zombie might be afraid that if it confessed to being a zombie, humans would kill or enslave it, so it tries to trick us into believing that it is human in order to avoid death or enslavement. Of course a zombie can’t really be afraid or deliberately try to trick us; that’s just me projecting onto the zombie interiority it doesn’t have. The real reason the zombie would act like it is conscious isn’t any deliberate effort to deceive, it’s that the zombie design (by the definition of the hypothetical) is functionally equivalent to a human. So it has to deceive us.
The point, however, is that if the zombie existed, it would in fact deceive us. We would anthropomorphize it.
Yes, we have no bananas
When my kid was a baby his favorite food was banana. He loved bananas so much that “banana” was his first word. But because he was just setting out to learn to speak, he would use the term “banana” to refer to anything he thought was tasty. So yogurt might be “banana.” He would also use the word “banana” as a general expression of desire, reaching out for a toy and saying “banana.”
Because even though he saw a relationship between “banana” and bananas, there are all different kinds of ways that could work. The word might mean “tasty thing” (like a banana) or “thing that I want” (like a banana). Soon enough his corpus of experience got bigger and he began to use the word properly.
When did he cross the line into truly understanding what the word banana means? Personally, I’d been using the term “banana” for many years when, at some point after college, I learned that the Cavendish banana (the thing sold in stores as a banana) is only one cultivar of banana and that generations ago people mostly ate Gros Michel bananas until they died off in a blight. Of course I knew that plantains existed, and that plantains are clearly similar to bananas. But I learned literally while researching this story that botanists have a concept of “true plantains” that are a subset of the larger group of “cooking bananas” and that cooking bananas along with the Cavendish and the Gros Michel are all different kinds of bananas. So apparently, I personally didn’t understand what “banana” means until I was 41 years old, and I bet my kid still doesn’t know.
And yet all this time nobody accused me of not knowing what a banana was. Not only did I fool the whole world about this, I even fooled myself. I would have told you with complete sincerity that I knew perfectly well what “banana” meant, even though I didn’t fully grasp the botanical issues in play.
The desert of the real
The self-deception part of this is where things get really freaky.
Rachael, in Blade Runner, doesn’t know she’s a replicant because she has these fake memories. This makes her a better replicant — she answers many more questions on the Voight-Kampff before Deckard can detect her. A true zombie that was genuinely functionally equivalent to a human would need all the same pattern-recognition functions as a human and the same pareidolia impulses as a human. So just as you would anthropomorphize the zombie, wouldn’t the zombie anthropomorphize itself?
Of course, it wouldn’t “really be” a sentient being with consciousness, but it would think it was.
Which is why (back to LaMDA and Skinner) I think the real question isn’t whether LaMDA is sentient (clearly no) but whether you have any firm basis for the belief that you are sentient and not just a really good pattern-matching machine.
The philosopher Daniel Dennett has an idea called the intentional stance. The gist of it is that you, as an observer, have the option of choosing to interpret some other system — be it a dog, a person, a chatbot, or whatever else — as having beliefs, desires, sentience, or whatever you want to call it. There is then the pragmatic question of whether adopting that stance helps you achieve your goals.
For a layperson, adopting an intentional stance vis-a-vis computers is often pretty productive, even when the system in question is relatively unsophisticated. You might say to yourself “Zoom doesn’t like it when I do ______” as a way of making an observation about the circumstances under which your combination of hardware and software does not deliver adequate performance. Of course a more technical understanding about bandwidth, RAM, etc. could be even more helpful if you were in a position to do something about it. But if you’re not, a simple rule of thumb that characterizes the software as having a personality that reliably gets cranky under certain circumstances that it is best to avoid can be very helpful.
The more sophisticated the computer system, the wider the range of users who may find it helpful to adopt an intentional stance at least some of the time. My understanding of deep learning models is that part of their nature is that even very knowledgeable computer people don’t understand exactly how a given model works. At that point it seems very helpful to adopt an intentional stance. On the flip side, as we learn more about neurology and biochemistry, we have a wider range of circumstances in which it can be helpful to analyze human beings as physical systems, adopting a non-intentional stance and attributing behavior to hormonal fluctuations, caffeine consumption, serotonin levels or what have you.
Asking which of these interpretations is “real” is like asking whether this is “really” a picture of a duck or of a rabbit. The pen strokes are what they are. Our comprehension is simply capacious enough to see them either way.
One of my concerns is that as we keep developing more and more sophisticated AI systems, we will keep reassuring ourselves that they are not “really” sentient and don’t “really” understand things or “really” have desires largely because we are setting the bar for this stuff at a level that human beings don’t provably clear. We simply start with the presumption that a human does meet the standard and a machine does not and then apply an evidentiary standard no machine can ever meet because no human can meet it either. But that way we could stumble into a massive crime or an extinction-level catastrophe.
Goodfellas is better because it’s more formally innovative and maintains better control over its tone.