The robots are coming. And that’s a good thing.
MIT's Daniela Rus isn’t worried that robots will take over the world. Instead, she envisions robots and humans teaming up to achieve things that neither could do alone.
In this excerpt from the new book, The Heart and the Chip: Our Bright Future with Robots, CSAIL Director Daniela Rus explores how robots can extend the reach of human capabilities.
Years ago, I befriended the biologist Roger Payne at a meeting of MacArthur Foundation fellows. Roger, who died in 2023, was best known for discovering that humpback whales sing and that the sounds of certain whales can be heard across the oceans. I’ve always been fascinated by whales and the undersea world in general; I’m an avid scuba diver and snorkeler. So it was no surprise that I thoroughly enjoyed Roger’s lecture. As it turned out, he found my talk on robots equally fascinating.
“How can I help you?” I asked him. “Can I build you a robot?”
A robot would be great, Roger replied, but what he really wanted was a capsule that could attach to a whale so he could dive with these wonderful creatures and truly experience what it was like to be one of them. I suggested something simpler, and Roger and I began exploring how a robot might aid him in his work.
When we first met, Roger had been studying whales for decades. One project was a long-term study on the behavior of a large group of southern right whales. These majestic mammals are 15 meters in length, with long, curving mouths and heads covered with growths called callosities. Roger had built a lab on the shores of Argentina’s Peninsula Valdés, an area that is cold, windy, and inhospitable to humans. The southern right whales love it, though. Every August they gather near the coast to have babies and mate. In 2009, Roger invited me to join him down at his lab. It was one of those invitations you just don’t decline.
Roger had been going to Peninsula Valdés for more than 40 years. Each season, he’d sit atop a cliff with binoculars and paper and pencil, and note which of his aquatic friends were passing by. Roger could identify each of the returning mammals by the unique callosities on their heads. He monitored their behavior, but his primary goal was to conduct the first long-term census of the population. He hoped to quantify the life span of these magnificent creatures, which are believed to live for a century or more.
As we started planning the trip, I suggested using a drone to observe the whales. Two of my former students had recently finished their degrees and were eager for an adventure. Plus, they had a robot that, with some minor adjustments, would be perfect for the task. After much discussion, reengineering, and planning, we brought along Falcon, the first eight-rotor drone that could hold a camera between its thrusters. Today such drones can be bought off the shelf, but in 2009, it was a breakthrough.
The clifftop vantage point from which Roger and his researchers had been observing the whales was better than being in the water with the great creatures, as the sight of divers would alter the whales’ behavior. Helicopters and planes, meanwhile, flew too high and their images were low resolution. The only problem with the cliff was that it was finite. The whales would eventually swim away and out of view.
Falcon removed these limitations and provided close-up images. The drone could fly for 20 to 30 minutes before its batteries ran down, and was capable of autonomous flight, though we kept a human at the controls. Immediately, Roger was besotted with his new research assistant, which offered him and his team a clear view of the whales for several miles without prompting any behavioral changes. In effect, they were throwing their eyes out over the ocean.
It’s far from the only way to use drones to extend the range of human eyes. After the whale project, we lent a drone to Céline Cousteau, the documentary film producer and granddaughter of the celebrated marine scientist Jacques Cousteau. She was studying uncontacted tribes in the Amazon and wanted to observe them without the risk of bringing germs like the cold virus to people who had not developed immunity.
In my lab, we also built a drone that launched from a self-driving car, flew ahead of the vehicle and around corners to scan the crowded confines of our subterranean parking garage, and relayed its video back to the car’s navigation system—similar to the tech that appears in the 2017 movie Spider-Man: Homecoming, when the superhero, clinging to the side of the Washington Monument, dispatches a miniature flying robot to scan the building. NASA pushed this application even further with Ingenuity, the drone that launched from the Perseverance rover to complete the first autonomous flight on Mars. Ingenuity extended the visual reach of the rover, rising into the thin sky and searching for ideal routes and interesting places to explore.
Other human capabilities could be extended robotically as well. Powered exoskeletons with extendable arms could help factory workers reach items on high shelves—a robotic version of the stretchy physicist Reed Richards from the Fantastic Four comics. At home, a simple, extendable robotic arm could be stashed in the closet and put to use to retrieve things that are hard to reach. This would be especially helpful for older individuals, letting them pick up items off the floor without having to strain their backs or test their balance.
The robotic arm is a somewhat obvious idea; other reach-extending devices could have unexpected shapes and forms. For instance, the relatively simple FLX Bot from FLX Solutions has a modular, snake-like body that’s only an inch thick, allowing it to access tight spaces, such as gaps behind walls; a vision system and intelligence enable it to choose its own path. The end of the robot can be equipped with a camera for inspecting impossible-to-reach places or a drill to make a hole for electrical wiring. The snakebot puts an intelligent spin on hammers and drills and functions as an extension of the human.
We can already pilot our eyes around corners and send them soaring off cliffs. But what if we could extend all of our senses to previously unreachable places? What if we could throw our sight, hearing, touch, and even sense of smell to distant locales and experience these places in a more visceral way? We could visit distant cities or far-off planets, and perhaps even infiltrate animal communities to learn more about their social organization and behavior.
For instance, I love to travel and experience the sights, sounds, and smells of a foreign city or landscape. I’d visit Paris once a week if I could, to walk the Champs-Élysées or the Jardin des Tuileries or enjoy the smells wafting out of a Parisian bakery. Nothing is ever as good as being there, of course, but we could use robots to approximate the experience of strolling through the famed city like a flâneur. Instead of merely donning a virtual-reality headset to immerse yourself in a digital world, you could use one of these devices, or something similar, to inhabit a distant robot in the actual world and experience that faraway place in an entirely new way.
Imagine mobile robots stationed throughout a city, like shareable motorized scooters or Citi Bikes. On a dreary day in Boston, I could switch on my headset, rent one of these robots, and remotely guide it through the Parisian neighborhood of my choice. The robot would have cameras to provide visual feedback and high-definition bidirectional microphones to capture sound. A much bigger challenge would be giving the robot the ability to smell its surroundings, perhaps taste the local food, and pass these sensations back to me. The human olfactory system uses 400 different types of smell receptors. A given scent might contain hundreds of chemical compounds and, when it passes through the nose, activate roughly 10% of these receptors. Our brains map this information onto a stored database of smells, and we can identify, say, a freshly baked croissant. Various research groups are using machine learning and advanced materials like graphene to replicate this approach in artificial systems. But maybe we should skip smell; the sights and sounds of Paris may suffice.
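To make that matching step concrete, here is a minimal sketch of the idea: a scent is represented as an activation pattern across receptor types and compared against a small stored library of known smells. The receptor count is the rough human figure mentioned above; the library entries and the reading are invented for illustration, not data from any real artificial nose.

```python
# Minimal sketch: match a receptor-activation pattern against a stored library
# of known scents. All numbers below are invented for illustration.
import numpy as np

NUM_RECEPTOR_TYPES = 400  # rough count of human olfactory receptor types

rng = np.random.default_rng(0)

# Hypothetical library of known scents, each stored as the degree to which it
# activates every receptor type (values in [0, 1]).
scent_library = {
    "croissant": rng.random(NUM_RECEPTOR_TYPES),
    "espresso": rng.random(NUM_RECEPTOR_TYPES),
    "wet pavement": rng.random(NUM_RECEPTOR_TYPES),
}

def identify_scent(activation: np.ndarray) -> str:
    """Return the library scent whose stored pattern is closest (cosine similarity)."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return max(scent_library, key=lambda name: cosine(activation, scent_library[name]))

# A noisy reading that mostly resembles the stored "croissant" pattern.
reading = scent_library["croissant"] + 0.1 * rng.standard_normal(NUM_RECEPTOR_TYPES)
print(identify_scent(reading))  # expected to print "croissant"
```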
Extending our perceptual reach through intelligent robots also has more practical applications. One idea we explored in my lab is a robotic Mechanical Turk for physical work. Developed by an innovative Hungarian in the late 18th century, the original Mechanical Turk was a contraption that appeared to play chess. In reality, a human chess player disguised inside the so-called machine manipulated the pieces. In 2005, Amazon launched its own variation on the concept through a service that lets businesses hire remote individuals to carry out tasks that computers can’t yet do. We envisioned a combination of the two ideas, in which a human remotely (but not secretly) operates a robot, guiding the machine through tasks that it could not complete on its own—and jobs that are too dangerous or unhealthy for humans to do themselves.
The inspiration for this project stemmed in part from my visit to a cold storage facility outside Philadelphia. I donned all the clothing that warehouse workers wear, which made the temperature manageable in the main room. But in the deep freezer room, where temperatures can be -30 °C or even colder, I barely lasted 10 minutes. I was still chilled to the bone many hours later, after several car rides and a flight, and had to take a hot bath to return my core temperature to normal. People should not have to operate in such extreme environments. Yet robots cannot handle all the needed tasks on their own without making mistakes—there are too many different sizes and shapes in the environment, and too many items packed closely together.
So we wondered what would happen if we were to tap into the worldwide community of gamers and use their skills in new ways. With a robot working inside the deep freezer room, or in a standard manufacturing or warehouse facility, remote operators could remain on call, waiting for it to ask for assistance if it made an error, got stuck, or otherwise found itself incapable of completing a task. A remote operator would enter a virtual control room that re-created the robot’s surroundings and predicament. This person would see the world through the robot’s eyes, effectively slipping into its body in that distant cold storage facility without being personally exposed to the frigid temperatures. Then the operator would intuitively guide the robot and help it complete the assigned task.
To validate our concept, we developed a system that allows people to remotely see the world through the eyes of a robot and perform a relatively simple task; then we tested it on people who weren’t exactly skilled gamers. In the lab, we set up a robot with manipulators, a stapler, wire, and a frame. The goal was to get the robot to staple wire to the frame. We used a humanoid, ambidextrous robot called Baxter, plus the Oculus VR system. Then we created an intermediate virtual room to put the human and the robot in the same system of coordinates—a shared simulated space. This let the human see the world from the point of view of the robot and control it naturally, using body motions. We demoed this system during a meeting in Washington, DC, where many participants—including some who’d never played a video game—were able to don the headset, see the virtual space, and control our Boston-based robot intuitively from 500 miles away to complete the task.
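The heart of that setup is the shared frame of reference. The sketch below shows the general idea in a few lines: a hand pose measured in the headset’s coordinates is mapped through a fixed calibration transform into the robot’s base frame and used as a gripper target. It illustrates only the coordinate-mapping step, not the actual Baxter/Oculus software, and the calibration numbers are made up.

```python
# Minimal sketch: map an operator's hand pose from VR coordinates into the
# robot's base frame via a fixed calibration transform. Illustrative only.
import numpy as np

def make_transform(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Calibration: where the VR space sits relative to the robot's base frame.
# In practice this comes from a one-time alignment step; here it is assumed.
T_robot_from_vr = make_transform(np.eye(3), np.array([0.5, 0.0, 0.8]))

def hand_pose_to_robot_target(T_vr_from_hand: np.ndarray) -> np.ndarray:
    """Map a hand pose (in VR coordinates) to a gripper target in robot coordinates."""
    return T_robot_from_vr @ T_vr_from_hand

# Example: the operator's hand 30 cm in front of the VR origin.
T_vr_from_hand = make_transform(np.eye(3), np.array([0.0, 0.3, 0.0]))
target = hand_pose_to_robot_target(T_vr_from_hand)
print(target[:3, 3])  # gripper target position in the robot's base frame
```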
The best-known and perhaps most compelling examples of remote teleoperation and extended reach are the robots NASA has sent to Mars in the last few decades. My PhD student Marsette “Marty” Vona helped develop much of the software that made it easy for people on Earth to interact with these robots tens of millions of miles away. These intelligent machines are a perfect example of how robots and humans can work together to achieve the extraordinary. Machines are better at operating in inhospitable environments like Mars. Humans are better at higher-level decision-making. So we send increasingly advanced robots to Mars, and people like Marty build increasingly advanced software to help other scientists see and even feel the faraway planet through the eyes, tools, and sensors of the robots. Then human scientists ingest and analyze the gathered data and make critical creative decisions about what the rovers should explore next. The robots all but situate the scientists on Martian soil. They are not taking the place of actual human explorers; they’re doing reconnaissance work to clear a path for a human mission to Mars. Once our astronauts venture to the Red Planet, they will have a level of familiarity and expertise that would not be possible without the rover missions.
Robots can allow us to extend our perceptual reach into alien environments here on Earth, too. In 2007, European researchers led by J.L. Deneubourg described a novel experiment in which they developed autonomous robots that infiltrated and influenced a community of cockroaches. The relatively simple robots were able to sense the difference between light and dark environments and move to one or the other as the researchers wanted. The miniature machines didn’t look like cockroaches, but they did smell like them, because the scientists covered them with pheromones that were attractive to other cockroaches from the same clan.
The goal of the experiment was to better understand the insects’ social behavior. Generally, cockroaches prefer to cluster in dark environments with others of their kind. The preference for darkness makes sense—they’re less vulnerable to predators or disgusted humans when they’re hiding in the shadows. When the researchers instructed their pheromone-soaked machines to group together in the light, however, the other cockroaches followed. They chose the comfort of a group despite the danger of the light.
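The control logic on each robot was deliberately simple. As a rough sketch of the behavior described here, with an invented sensor threshold and interface, the decision each robot faces at every step looks something like this:

```python
# Minimal sketch of light/dark-seeking behavior: read a light sensor and either
# settle in place or keep wandering, depending on which environment the
# experimenters want the robot to favor. Threshold and interface are invented.
import random

LIGHT_THRESHOLD = 0.5  # sensor readings above this count as "light"

def step(sensor_reading: float, target: str) -> str:
    """Return the next action ("settle" or "wander") for one control step."""
    in_light = sensor_reading > LIGHT_THRESHOLD
    wants_light = target == "light"
    return "settle" if in_light == wants_light else "wander"

# Simulate a few steps of a robot told to gather in the light.
for _ in range(5):
    reading = random.random()
    print(f"reading={reading:.2f} -> {step(reading, target='light')}")
```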
These robotic roaches bring me back to my first conversation with Roger Payne all those years ago, and his dreams of swimming alongside his majestic friends. What if we could build a robot that accomplished something similar to his imagined capsule? What if we could create a robotic fish that moved alongside marine creatures and mammals like a regular member of the aquatic neighborhood? That would give us a phenomenal window into undersea life.
Sneaking into and following aquatic communities to observe behaviors, swimming patterns, and creatures’ interactions with their habitats is difficult. Stationary observatories cannot follow fish. Humans can only stay underwater for so long.
Remotely operated and autonomous underwater vehicles typically rely on propellers or jet-based propulsion systems, and it’s hard to go unnoticed when your robot is kicking up so much turbulence. We wanted to create something different—a robot that actually swam like a fish. This project took us many years, as we had to develop new artificial muscles, soft skin, novel ways of controlling the robot, and an entirely new method of propulsion. I’ve been diving for decades, and I have yet to see a fish with a propeller. Our robot, SoFi (pronounced like Sophie), moves by swinging its tail back and forth like a shark. A dorsal fin and twin fins on either side of its body allow it to dive, ascend, and move through the water smoothly, and we’ve already shown that SoFi can navigate around other aquatic life forms without disrupting their behavior.
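For readers who want a feel for how a tail-driven gait is commanded, here is a stripped-down sketch of sinusoidal tail control: the amplitude and frequency of the oscillation set the pace, and a constant bias to one side turns the robot. This is the generic fish-robot idea, not SoFi’s actual soft-actuator controller, and the numbers are illustrative.

```python
# Minimal sketch of a sinusoidal tail gait: the tail angle oscillates back and
# forth; amplitude and frequency set speed, a constant offset biases the stroke
# to one side to turn. Illustrative parameters, not SoFi's real controller.
import math

def tail_angle(t: float, amplitude_deg: float = 30.0,
               frequency_hz: float = 1.5, turn_bias_deg: float = 0.0) -> float:
    """Commanded tail angle (degrees) at time t (seconds)."""
    return turn_bias_deg + amplitude_deg * math.sin(2 * math.pi * frequency_hz * t)

# Example: sample half a second of a straight-swimming stroke at 10 Hz.
for i in range(6):
    t = i * 0.1
    print(f"t={t:.1f}s  angle={tail_angle(t):+.1f} deg")
```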
SoFi is about the size of an average snapper and has taken some lovely tours in and around coral reef communities in the Pacific Ocean at depths of up to 18 meters. Human divers can venture deeper, of course, but the presence of a scuba-diving human changes the behavior of the marine creatures. A few scientists remotely monitoring and occasionally steering SoFi cause no such disruption. By deploying one or several realistic robotic fish, scientists will be able to follow, record, monitor, and potentially interact with fish and marine mammals as if they were just members of the community.
Eventually we’d like to be able to extend the reach of our ears, too, into the seas. Along with my friends Rob Wood, David Gruber, and several other biologists and AI researchers, we are attempting to use machine learning and robotic instruments to record and then decode the language of sperm whales. We hope to be able to discover common fragments of whale vocalizations and, eventually, to identify sequences that may correspond to syllables or even concepts. Humans map sounds to words, which in turn correspond to concepts or things. Do whales communicate in a similar fashion? We aim to find out. If we extend our ears into the sea and leverage machine learning, perhaps someday we will even be able to communicate meaningfully with these fascinating creatures.
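One simple way to start hunting for such fragments, sketched below with toy data, is to discretize a vocalization stream into symbols and count which short subsequences keep recurring. The real project works from raw acoustic recordings with far more sophisticated machine-learning models; the symbols here are invented.

```python
# Minimal sketch: discretize a vocalization stream into symbols and count which
# short subsequences repeat, as candidate "fragments." Toy data only.
from collections import Counter

def common_fragments(symbols: list[str], length: int = 3, top_k: int = 3):
    """Return the top_k most frequent subsequences of the given length."""
    ngrams = [tuple(symbols[i:i + length]) for i in range(len(symbols) - length + 1)]
    return Counter(ngrams).most_common(top_k)

# A toy stream of discretized vocal units (e.g., click-pattern labels).
stream = list("ABCABCXYABCZZABC")
print(common_fragments(stream))  # ('A', 'B', 'C') should surface as a repeat
```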
The knowledge yielded would be reward enough, but the impact could be much larger. One unexpected result of Roger’s discovery that whales sing and communicate was the “save the whales” movement. His scientific verification of their intelligence spurred a global effort to protect them. He hoped that learning more about the other species on our planet could have a similar effect. As Roger often pointed out, our survival as a species depends on the survival of our small and large neighbors on this planet. Biodiversity is part of what makes Earth a wonderful place for humans to live, and the more we can do to protect these other life forms, the better the chances that our planet continues to be a habitable environment for people in the centuries to come.
These examples of how we can pair the heart with the chip to extend our perceptual reach range from the whimsical to the profound. And the potential for other applications is vast. Environmental and government organizations tasked with protecting our landscapes could dispatch eyes to autonomously monitor land for illegal deforestation without putting people at risk. Remote workers could use robots to extend their hands into dangerous environments, manipulating or moving objects at hazardous nuclear sites. Scientists could peek or listen into the secret lives of the many amazing species on this planet. Or we could harness our efforts to find a way to remotely experience Paris or Tokyo or Tangier. The possibilities are endless and endlessly exciting. We just need effort, ingenuity, strategy, and the most precious resource of all.
No, not funding, although that is helpful.
We need time.
Excerpted from The Heart and the Chip: Our Bright Future with Robots. Copyright © 2024 by Daniela Rus and Gregory Mone. Used with permission of the publisher, W.W. Norton & Company. All rights reserved.