Toyota’s robots probably can’t whip up a frittata yet, but they can whisk some eggs for you.
Toyota Research Institute (TRI) used generative AI in a “kindergarten for robots” to teach robots how to make breakfast (or at least, the individual tasks needed to do so), and it didn’t take hundreds of hours of coding, errors, and bug fixing. Instead, researchers gave the robots a sense of touch, plugged them into an AI model, and then showed them how, as you would a human being.
The sense of touch is “one key enabler,” researchers say. The big, pillowy thumb (my term, not theirs) that you see in the video below lets the model “feel” what it’s doing, which gives it more information. That makes difficult tasks easier to carry out than with sight alone.
Ben Burchfiel, the lab’s manager of dexterous manipulation, says it’s “exciting to see them engaging with their environments.” First, a “teacher” demonstrates a set of skills, and then, “over a matter of hours,” the model learns in the background. He adds that “it’s common for us to teach a robot in the afternoon, let it learn overnight, and then come in the next morning to a working new behavior.”
The researchers say they’re attempting to create “Large Behavior Models,” or LBMs (yes, I also want this to mean Large Breakfast Models), for robots. Similar to how LLMs are trained by noting patterns in human writing, Toyota’s LBMs would learn by observation, then “generalize, performing a new skill that they’ve never been taught,” says Russ Tedrake, MIT robotics professor and VP of robotics research at TRI.
Using this process, the researchers say they’ve trained over 60 challenging skills, like “pouring liquids, using tools, and manipulating deformable objects.” They want to up that number to 1,000 by the end of 2024.
Google, with its Robotic Transformer RT-2, and Tesla have been doing similar research. As with the approach of Toyota’s researchers, their robots use the experience they’ve been given to infer how to do things. Theoretically, AI-trained robots could eventually carry out tasks with little to no instruction other than the kind of general direction you would give a human being (“clean that spill,” for instance).
But Google’s robots, at least, have a ways to go, as The New York Times noted when writing about the search giant’s research. The Times writes that this sort of work is usually “slow and labor-intensive,” and that providing enough training data is much harder than just feeding an AI model gobs of data downloaded from the internet. The article illustrates the point by describing a robot that identified a banana’s color as white.