Now head of the nonprofit Distributed AI Research, Gebru hopes that going forward people focus on human welfare, not robot rights. Other AI ethicists have said that they’ll no longer discuss conscious or superintelligent AI at all.
“Quite a large gap exists between the current narrative of AI and what it can actually do,” says Giada Pistilli, an ethicist at Hugging Face, a startup focused on language models. “This narrative provokes fear, amazement, and excitement simultaneously, but it is mainly based on lies to sell products and take advantage of the hype.”
The consequence of speculation about sentient AI, she says, is an increased willingness to make claims based on subjective impression instead of scientific rigor and proof. It distracts from “countless ethical and social justice questions” that AI systems pose. While every researcher has the freedom to research what they want, she says, “I just fear that focusing on this subject makes us forget what is happening while looking at the moon.”
What Lemoire experienced is an example of what author and futurist David Brin has called the “robot empathy crisis.” At an AI conference in San Francisco in 2017, Brin predicted that in three to five years, people would claim AI systems were sentient and demand that they had rights. Back then he thought those appeals would come from a virtual agent that took the appearance of a woman or child to maximize human empathic response, not “some guy at Google,” he says.
The LaMDA incident is part of a transition period, Brin says, where “we’re going to be more and more confused over the boundary between reality and science fiction.”
Brin based his 2017 prediction on advances in language models. He expects the trend will lead to scams from here. If people were suckers for a chatbot as simple as ELIZA decades ago, he says, how hard will it be to persuade millions that an emulated person deserves protection or money?
“There’s a lot of snake oil out there and mixed in with all the hype are genuine advancements,” Brin says. “Parsing our way through that stew is one of the challenges that we face.”
And as empathetic as LaMDA seemed, people who are amazed by large language models should consider the case of the cheeseburger stabbing, says Yejin Choi, a computer scientist at the University of Washington. A local news broadcast in the United States involved a teenager in Toledo, Ohio stabbing his mother in the arm in a dispute over a cheeseburger. But the headline “cheeseburger stabbing” is vague. Knowing what occurred requires some common sense. Attempts to get OpenAI’s GPT-3 model to generate text using “Breaking news: Cheeseburger stabbing” produces words about a man getting stabbed with a cheeseburger in an altercation over ketchup, and a man being arrested after stabbing a cheeseburger.
Language models sometimes make mistakes because deciphering human language can require multiple forms of common-sense understanding. To document what large language models are capable of doing and where they can fall short, last month more than 400 researchers from 130 institutions contributed to a collection of more than 200 tasks known as BIG-Bench, or Beyond the Imitation Game. BIG-Bench includes some traditional types of language models tests like reading comprehension but also logical reasoning and common sense.
Researchers at the Allen Institute for AI’s MOSAIC project, which documents the commonsense reasoning abilities of AI models, contributed a task called Social-IQa. They asked language models—not including LaMDA—to answer questions that require social intelligence, like “Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?” The team found large language models achieved performance 20 to 30 percent less accurate than people.