Watch this robot cook shrimp and clean autonomously
Even cheap hardware can perform complex tasks, and AI is helping robots get smarter still.
Sophisticated robots don’t have to cost a fortune. Even relatively cheap robots can do complex manipulation tasks and learn new skills quickly using AI, a new study has shown.
With just $32,000, researchers from Stanford University managed to build a wheeled robot that can cook a three-course Cantonese meal with human supervision. Then they used AI to train it to autonomously do individual tasks such as cooking shrimp, cleaning up stains, and calling an elevator. Other robots that are capable of such complex tasks tend to cost hundreds of thousands of dollars, but the researchers kept the project’s costs low by choosing off-the-shelf robot parts and 3D-printed hardware.
The researchers taught the robot, called Mobile ALOHA (an acronym for “a low-cost open-source hardware teleoperation system for bimanual operation”), seven different tasks requiring a variety of mobility and dexterity skills, such as rinsing a pan or giving someone a high five.
To teach the robot how to cook shrimp, for example, the researchers remotely operated it 20 times to get the shrimp into the pan, flip it, and then serve it. They did it slightly differently each time so the robot learned different ways to do the same task, says Zipeng Fu, a PhD student at Stanford who was the project's co-lead.
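In practice, each of those teleoperated runs becomes a logged trajectory of observations paired with the operator's commands. Below is a minimal, hypothetical sketch of that idea; the `read_cameras` and `read_operator_joints` helpers, field names, and rates are illustrative stand-ins, not the team's actual recording stack.

```python
# Toy sketch: one teleoperated demonstration stored as a time-ordered list of
# (observation, operator action) pairs. Helpers passed in are hypothetical.
import time

def record_demo(read_cameras, read_operator_joints, seconds=1.0, hz=50):
    """Log observations and the human operator's commands at a fixed rate."""
    demo = []
    t_end = time.time() + seconds
    while time.time() < t_end:
        demo.append({
            "obs": read_cameras(),             # camera images / robot state
            "action": read_operator_joints(),  # what the human commanded
        })
        time.sleep(1.0 / hz)
    return demo

# Repeating this roughly 20 times, with the shrimp and pan placed slightly
# differently each run, yields the varied dataset described above.
demos = [record_demo(lambda: "frame", lambda: "joints", seconds=0.1) for _ in range(3)]
```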
The robot was then trained on these demonstrations, along with human-operated demonstrations of unrelated tasks, such as tearing off a piece of paper towel or tape, collected by an earlier, stationary version of ALOHA, says Chelsea Finn, an assistant professor at Stanford University who advised the project. This "co-training" approach, in which new and old data are combined, helped Mobile ALOHA learn new jobs relatively quickly, compared with the usual approach of training AI systems on thousands, if not millions, of examples. Even though that older data came from tasks that had nothing to do with the one at hand, it helped the robot pick up new skills, says Finn.
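The sketch below illustrates the general shape of such co-training as a simple behavior-cloning setup: batches mix new-task demonstrations with a larger pool of prior, unrelated demonstrations. The tiny MLP policy, the dimensions, and the 50/50 mixing ratio are assumptions for illustration, not the team's actual model or hyperparameters.

```python
# Minimal co-training sketch: behavior cloning on a mix of new-task demos and
# a larger prior dataset of unrelated tasks. All sizes/ratios are illustrative.
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader, WeightedRandomSampler

OBS_DIM, ACT_DIM = 64, 14            # e.g. state features in, joint targets out

def fake_demos(n_steps):
    """Stand-in for recorded (observation, action) pairs from teleoperation."""
    return TensorDataset(torch.randn(n_steps, OBS_DIM), torch.randn(n_steps, ACT_DIM))

new_task = fake_demos(2_000)         # ~20 demos of the target task (e.g. cooking shrimp)
prior_data = fake_demos(50_000)      # earlier demos of unrelated tasks

dataset = ConcatDataset([new_task, prior_data])
# Weight samples so each batch is roughly half new-task, half prior data,
# even though the prior dataset is much larger.
weights = torch.cat([
    torch.full((len(new_task),), 0.5 / len(new_task)),
    torch.full((len(prior_data),), 0.5 / len(prior_data)),
])
sampler = WeightedRandomSampler(weights, num_samples=len(dataset))
loader = DataLoader(dataset, batch_size=256, sampler=sampler)

# A small MLP policy mapping observations to actions (a stand-in for the far
# more capable model a real system would use).
policy = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ReLU(), nn.Linear(256, ACT_DIM))
optim = torch.optim.Adam(policy.parameters(), lr=1e-4)

for epoch in range(2):               # a couple of passes, just to show the loop
    for obs, act in loader:
        loss = nn.functional.mse_loss(policy(obs), act)  # imitate the operator's actions
        optim.zero_grad()
        loss.backward()
        optim.step()
    print(f"epoch {epoch}: behavior-cloning loss {loss.item():.3f}")
```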
While these sorts of household tasks are easy for humans (at least when we're in the mood for them), they are still very hard for robots. Robots struggle to grip, grab, and manipulate objects, because they lack the precision, coordination, and understanding of the surrounding environment that humans naturally have. However, recent efforts to apply AI techniques to robotics have shown a lot of promise in unlocking new capabilities. For example, Google's RT-2 system combines a vision-language model with a robot, which allows humans to give it verbal commands.
“One of the things that’s really exciting is that this recipe of imitation learning is very generic. It’s very simple. It’s very scalable,” says Finn. Collecting more data for robots to try to imitate could allow them to handle even more kitchen-based tasks, she adds.
“Mobile ALOHA has demonstrated something unique: relatively cheap robot hardware can solve really complex problems,” says Lerrel Pinto, an associate professor of computer science at NYU, who was not involved in the research.
Mobile ALOHA shows that robot hardware is already very capable, and underscores that AI is the missing piece in making robots that are more useful, adds Deepak Pathak, an assistant professor at Carnegie Mellon University, who was also not part of the research team.
Pinto says the model also shows that robotics training data can be transferable: training on one task can improve its performance for others. “This is a strongly desirable property, as when data increases, even if it is not necessarily for a task you care about, it can improve the performance of your robot,” he says.
Next the Stanford team is going to train the robot on more data to do even harder tasks, such as picking up and folding crumpled laundry, says Tony Z. Zhao, a PhD student at Stanford who was part of the team. Laundry has traditionally been very hard for robots, because the objects are bunched up in shapes they struggle to understand. But Zhao says their technique will help the machines tackle tasks that people previously thought were impossible.