AI is dreaming up drugs that no one has ever seen. Now we’ve got to see if they work.
AI automation throughout the drug development pipeline is opening up the possibility of faster, cheaper pharmaceuticals.
At 82 years old, with an aggressive form of blood cancer that six courses of chemotherapy had failed to eliminate, Paul appeared to be out of options. With each long and unpleasant round of treatment, his doctors had been working their way down a list of common cancer drugs, hoping to hit on something that would prove effective—and crossing them off one by one. The usual cancer killers were not doing their job.
With nothing to lose, Paul’s doctors enrolled him in a trial set up by the Medical University of Vienna in Austria, where he lives. (Paul’s real name is not known because his identity was obscured in the trial.) The university was testing a new matchmaking technology developed by a UK-based company called Exscientia that pairs individual patients with the precise drugs they need, taking into account the subtle biological differences between people.
The researchers took a small sample of tissue from Paul. They divided the sample, which included both normal cells and cancer cells, into more than a hundred pieces and exposed them to various cocktails of drugs. Then, using robotic automation and computer vision (machine-learning models trained to identify small changes in cells), they watched to see what would happen.
In effect, the researchers were doing what the doctors had done: trying different drugs to see what worked. But instead of putting a patient through multiple months-long courses of chemotherapy, they were testing dozens of treatments all at the same time.
The approach allowed the team to carry out an exhaustive search for the right drug. Some of the medicines didn’t kill Paul’s cancer cells. Others harmed his healthy cells. Paul was too frail to take the drug that came out on top. So he was given the runner-up in the matchmaking process: a cancer drug marketed by the pharma giant Johnson & Johnson that Paul’s doctors had not tried because previous trials had suggested it was not effective at treating his type of cancer.
It worked. Two years on, Paul was in complete remission—his cancer was gone. The approach is a big change for the treatment of cancer, says Exscientia’s CEO, Andrew Hopkins: “The technology we have to test drugs in the clinic really does translate to real patients.”
Selecting the right drug is just half the problem that Exscientia wants to solve. The company is set on overhauling the entire drug development pipeline. In addition to pairing patients up with existing drugs, Exscientia is using machine learning to design new ones. This could in turn yield even more options to sift through when looking for a match.
The first drugs designed with the help of AI are now in clinical trials, the rigorous tests done on human volunteers to see if a treatment is safe—and really works—before regulators clear them for widespread use. Since 2021, two drugs that Exscientia developed (or co-developed with other pharma companies) have started the process. The company is on the way to submitting two more.
“If we were using a traditional approach, we couldn’t have scaled this fast,” Hopkins says.
Exscientia isn’t alone. There are now hundreds of startups exploring the use of machine learning in the pharmaceutical industry, says Nathan Benaich at Air Street Capital, a VC firm that invests in biotech and life sciences companies: “Early signs were exciting enough to attract big money.”
Today, on average, it takes more than 10 years and billions of dollars to develop a new drug. The vision is to use AI to make drug discovery faster and cheaper. By predicting how potential drugs might behave in the body and discarding dead-end compounds before they leave the computer, machine-learning models can cut down on the need for painstaking lab work.
And there is always a need for new drugs, says Adityo Prakash, CEO of the California-based drug company Verseon: “There are still too many diseases we can’t treat or can only treat with three-mile-long lists of side effects.”
Now, new labs are being built around the world. Last year Exscientia opened a new research center in Vienna; in February, Insilico Medicine, a drug discovery firm based in Hong Kong, opened a large new lab in Abu Dhabi. All told, around two dozen drugs (and counting) that were developed with the assistance of AI are now in or entering clinical trials.
We’re seeing this uptick in activity and investment because increasing automation in the pharmaceutical industry has started to produce enough chemical and biological data to train good machine-learning models, explains Sean McClain, founder and CEO of Absci, a firm based in Vancouver, Washington, that uses AI to search through billions of potential drug designs. “Now is the time,” McClain says. “We’re going to see huge transformation in this industry over the next five years.”
Yet it is still early days for AI drug discovery. There are a lot of AI companies making claims they can’t back up, says Prakash: “If somebody tells you they can perfectly predict which drug molecule can get through the gut or not get broken up by the liver, things like that, they probably also have land to sell you on Mars.”
And the technology is not a panacea: experiments on cells and tissues in the lab and tests in humans—the slowest and most expensive parts of the development process—cannot be cut out entirely. “It’s saving us a lot of time. It’s already doing a lot of the steps that we used to do by hand,” says Luisa Salter-Cid, chief scientific officer at Pioneering Medicines, part of the startup incubator Flagship Pioneering in Cambridge, Massachusetts. “But the ultimate validation needs to be done in the lab.” Still, AI is already changing how drugs are being made. It could be a few years yet before the first drugs designed with the help of AI hit the market, but the technology is set to shake up the pharma industry, from the earliest stages of drug design to the final approval process.
The basic steps involved in developing a new drug from scratch haven’t changed much. First, pick a target in the body that the drug will interact with, such as a protein; then design a molecule that will do something to that target, such as change how it works or shut it down. Next, make that molecule in a lab and check that it actually does what it was designed to do (and nothing else); and finally, test it in humans to see if it is both safe and effective.
For decades chemists have screened candidate drugs by putting samples of the desired target into lots of little compartments in a lab, adding different molecules, and watching for a reaction. Then they repeat this process many times, tweaking the structure of the candidate drug molecules—swapping out this atom for that one—and so on. Automation has sped things up, but the core process of trial and error is unavoidable.
But test tubes are not bodies. Many drug molecules that appear to do their job in the lab end up failing when they are eventually tested in people. “The whole process of drug discovery is about failure,” says biologist Richard Law, chief business officer at Exscientia. “The reason that the cost of coming up with a drug is so high is because you have to design and test 20 drugs to get one to work.”
This new generation of AI companies is focusing on three key failure points in the drug development pipeline: picking the right target in the body, designing the right molecule to interact with it, and determining which patients that molecule is most likely to help.
Computational techniques like molecular modeling have been reshaping the drug development pipeline for decades. But even the most powerful approaches have involved building models by hand, a process that is slow, hard, and liable to yield simulations that diverge from real-world conditions. With machine learning, vast amounts of data, including drug and molecular data, can be harnessed to build complex models automatically. This makes it far easier—and faster—to predict how drugs might behave in the body, allowing many early experiments to be carried out in silico. Machine-learning models can also sift through vast, untapped pools of potential drug molecules in a way that was not previously possible. The upshot is that the hard, but essential, work in laboratories (and later in clinical trials) need only be carried out on those molecules with the best chances of success.
Before they even get to simulating drug behavior, many companies are applying machine learning to the problem of identifying targets. Exscientia and others use natural-language processing to mine data from vast archives of scientific reports going back decades, including hundreds of thousands of published gene sequences and millions of academic papers. The information extracted from these documents is encoded in knowledge graphs—a way to organize data that captures links including causal relationships such as “A causes B.” Machine-learning models can then predict which targets might be the most promising ones to focus on in trying to treat a particular disease.
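To make the knowledge-graph idea concrete, here is a minimal sketch in Python. It is not Exscientia's system: the triples, entity names, and helper functions are all hypothetical, invented purely for illustration, and a plain breadth-first search stands in for the machine-learning models that would rank targets in practice. It shows only how "A causes B" style links can be stored and then chained to connect a candidate target to a disease.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object) triples, invented for illustration.
# In a real pipeline these would be extracted from papers by NLP models.
TRIPLES = [
    ("gene_A", "encodes", "protein_A"),
    ("protein_A", "activates", "pathway_X"),
    ("pathway_X", "causes", "disease_D"),
    ("gene_B", "encodes", "protein_B"),
    ("protein_B", "inhibits", "pathway_Y"),
]


def build_graph(triples):
    """Index triples by subject so outgoing links can be followed quickly."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a chain of links from start to goal.

    Returns the chain as a list of (subject, relation, object) triples,
    or None if the two entities are not connected.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


graph = build_graph(TRIPLES)
path_a = find_path(graph, "protein_A", "disease_D")  # linked via pathway_X
path_b = find_path(graph, "protein_B", "disease_D")  # no link in this toy graph
```

In this toy graph, protein_A connects to the disease through two links while protein_B does not connect at all, which is the kind of signal a target-ranking model could build on.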
Applying natural-language processing to data mining is not new, but pharmaceutical companies, including the bigger players, are now making it a key part of their process, hoping it can help them find connections that humans might have missed.
Jim Weatherall, vice president of data science and AI at AstraZeneca, says that getting AI to crawl through lots of biomedical data has helped him and his team find a few drug targets they would not otherwise have considered. “It’s made a real difference,” he says. “No human is going to read millions of biology papers.” Weatherall says the technique has revealed connections between things that might seem unrelated, such as a recent finding and a forgotten result from 10 years ago. “Our biologists then go and look at that and see if it makes sense,” says Weatherall. It’s still early days for this target-identification technique, though. He says it will be “some years” before any AstraZeneca drugs that result from it go into clinical trials.
But picking a target is just the start. The bigger challenge is designing a drug molecule that will do something with it—and this is where most innovation is happening.
The interaction between molecules inside a body is vastly complicated. Many drugs have to pass through hostile environments, such as the gut, before they can do their job. And everything is governed by physical and chemical laws that operate at atomic scales. The goal of most AI-powered approaches to drug design is to navigate the vast possibilities and quickly home in on new molecules that tick as many boxes as possible.
Generate Biomedicines, a startup based in Cambridge, Massachusetts, founded by Flagship Pioneering, is aiming to do that using the same kind of generative AI behind text-to-image software like DALL-E 2. Instead of manipulating pixels, Generate’s software works with random strands of amino acids and finds ways to twist them up into protein structures with specific properties. Since the functions of a protein are dictated by its 3D folding, this, in effect, makes it possible to order up a protein capable of doing a particular job. (Other groups, including David Baker’s lab at the University of Washington, are developing similar tech.)
Absci is also trying to create new protein-based drugs using machine learning, but through a different approach. The company takes existing antibodies—proteins that the immune system uses to remove bacteria, viruses, and other unwanted assailants—and uses models trained on data from lab experiments to come up with lots of new designs for the parts of those antibodies that glom onto foreign matter. The idea is to redesign existing antibodies to make them better at binding to targets. After making adjustments in simulation, the researchers then synthesize and test the designs that work best.
In January, Absci, which has partnerships with larger pharmaceutical companies such as Merck, announced that it had used its approach to redesign several existing antibodies, including one that targets the spike protein of SARS-CoV-2, the virus that causes covid-19, and another that blocks a type of protein that helps cancer cells grow.
Apriori Bio, another Flagship Pioneering startup based in Cambridge, also has its eye on covid, hoping in particular to develop vaccines capable of protecting people from a wide range of viral variants. The company builds millions of variants in the lab and tests how well covid-fighting antibodies grab onto them. It then uses machine learning to predict how the best antibodies would fare against 100 billion billion (10^20) more variants. The goal is to take the most promising antibodies, the ones that seem able to take on a large range of variants or might combat particular variants of concern, and use them to design variant-proof vaccines.
“It’s just not viable to ever do this experimentally,” says Lovisa Afzelius, a partner at Flagship Pioneering and CEO of Apriori Bio. “There is no way that your human brain can put all those bits and pieces in place and figure out that entire system.”
For Prakash, this is where AI’s real potential lies: opening up a huge untapped pool of biological and chemical structures that could become the ingredients of future drugs. Once you strip out very similar molecules, Prakash says, all of Big Pharma taken together—Merck, Novartis, AstraZeneca, and so on—has an ingredient list of at most 10 million molecules to build drugs from, some proprietary and some commonly known. “That’s what we’re testing across the entire planet—the total product of the last hundred years of toil from a lot of chemists,” he says.
And yet, he says, the number of possible molecules that might make drugs, according to the rules of organic chemistry, is 10^33 (other estimates have put the number of drug-like molecules even higher, in the realm of 10^60). “Compare that number to 10 million and you see we’re not even fishing in a tide pool next to the ocean,” Prakash says. “We’re fishing in a droplet.”
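The scale of that gap is easy to check. The short Python sketch below uses only the figures quoted above (10 million known ingredients, and estimates of 10 to the power 33 and 10 to the power 60 possible molecules); the variable and function names are ours, chosen for illustration.

```python
from fractions import Fraction

KNOWN_MOLECULES = 10**7   # "at most 10 million", per Prakash
POSSIBLE_LOW = 10**33     # Prakash's figure for possible drug molecules
POSSIBLE_HIGH = 10**60    # the higher estimate mentioned in the text


def explored_fraction(known, possible):
    """Exact share of the possible chemical space covered by known molecules."""
    return Fraction(known, possible)


frac_low = explored_fraction(KNOWN_MOLECULES, POSSIBLE_LOW)    # 1 in 10**26
frac_high = explored_fraction(KNOWN_MOLECULES, POSSIBLE_HIGH)  # 1 in 10**53

print(f"Share explored (lower estimate):  1 in 10^{len(str(frac_low.denominator)) - 1}")
print(f"Share explored (higher estimate): 1 in 10^{len(str(frac_high.denominator)) - 1}")
```

Even on the lower estimate, the molecules in use today amount to one part in 10^26 of what chemistry allows.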
Like others, Prakash’s company, Verseon, is using both old and new computational techniques to survey this ocean, generating millions of possible molecules and testing their properties. Verseon treats the interaction between drugs and proteins in the body as a physics problem, simulating the push and pull between atoms that influences how molecules fit together. Such molecular simulations are not new, but Verseon uses AI to more accurately model how molecules interact. So far, the company has produced 16 candidate drugs for a range of diseases, including cardiovascular conditions, infectious diseases, and cancer. One of those drugs is in clinical trials, and trials for several others are set to begin soon.
Crucially, simulation allows researchers to zip past a lot of the messiness that generally characterizes the drug design process. Companies traditionally create batches of molecules they hope have certain properties and then test each in turn. With machine learning, they can instead start with a wish list of basic characteristics—encoded mathematically—and produce designs for molecules that have those properties at the push of a button. This flips the early phase of development on its head, says Salter-Cid: “It’s not something we used to be able to do at the beginning.” A company might ordinarily make 2,500 to 5,000 compounds over five years when developing a new drug. Exscientia made 136 for one of its new cancer drugs, in just one year.
“It’s about speeding up cycles of exploration,” says Weatherall. “We’re getting to the stage now where we can make more and more decisions without actually having to make a molecule for real.”
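What a mathematically encoded wish list might look like can be sketched in a few lines of Python. This is a deliberately simplified stand-in: it filters a fixed set of candidates rather than generating new designs, which is what the machine-learning systems described here do. The property names, acceptable ranges, and candidate scores are all hypothetical.

```python
# Hypothetical wish list: property name -> (minimum, maximum) acceptable value.
# Scores are on an invented 0-to-1 scale, purely for illustration.
WISH_LIST = {
    "binding_affinity": (0.8, 1.0),
    "solubility": (0.5, 1.0),
    "toxicity": (0.0, 0.2),
}

# Hypothetical candidates with predicted property scores.
CANDIDATES = {
    "mol_1": {"binding_affinity": 0.90, "solubility": 0.7, "toxicity": 0.1},
    "mol_2": {"binding_affinity": 0.95, "solubility": 0.3, "toxicity": 0.1},
    "mol_3": {"binding_affinity": 0.60, "solubility": 0.8, "toxicity": 0.5},
}


def satisfied(properties, wish_list):
    """Return the names of the wish-list properties that a candidate meets."""
    return [
        name
        for name, (low, high) in wish_list.items()
        if low <= properties[name] <= high
    ]


def shortlist(candidates, wish_list):
    """Keep only candidates that meet every property on the wish list."""
    return sorted(
        name
        for name, props in candidates.items()
        if len(satisfied(props, wish_list)) == len(wish_list)
    )


selected = shortlist(CANDIDATES, WISH_LIST)
```

Only candidates that meet every constraint would go on to be synthesized, which is how a program can get by with far fewer lab-made compounds.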
However they are made, drugs still have to be tested in humans. These final phases of drug development, which involve recruiting large numbers of volunteers, are hard to run and generally take a long time—around 10 years on average and sometimes up to 20. Many drugs take years to get to this stage and still fail.
AI won’t be able to speed the clinical trial process, but it could help drug companies stack the odds more in their favor, by cutting down the time and cost involved in searching for new drug candidates. Less time spent testing dead-end drug molecules in the lab should mean that promising candidates will make it to clinical trials faster. And with less money on the line, companies might not feel as much pressure to stick with a drug that isn’t performing particularly well.
Better targeting of patients could also help improve the process. Most clinical trials measure the average effect of a medicine, tallying up how many people it worked for and how many it didn’t. If enough people in the trial see an improvement in their condition, then the drug is considered successful. If the drug isn’t effective for a large enough percentage, then it’s a failure. But this can mean that small groups of people for whom a drug worked get overlooked.
“It’s a very crude way of doing it,” says Weatherall. “What we’d actually like to do is find the subset of patients who would get the most benefit from a drug.”
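A toy calculation shows how an average can hide a subgroup of responders. The trial records below are hypothetical, made up to illustrate the arithmetic rather than drawn from any real study, and the "marker" labels are an assumed way of splitting patients.

```python
# Hypothetical trial records: (patient subgroup, did the patient improve?).
# 10 marker-positive patients (8 improve) and 40 marker-negative (4 improve).
TRIAL = (
    [("marker_pos", True)] * 8
    + [("marker_pos", False)] * 2
    + [("marker_neg", True)] * 4
    + [("marker_neg", False)] * 36
)


def response_rate(records):
    """Fraction of patients in the records who improved."""
    return sum(1 for _, improved in records if improved) / len(records)


def rate_by_subgroup(records):
    """Response rate computed separately for each subgroup label."""
    groups = {}
    for label, improved in records:
        groups.setdefault(label, []).append((label, improved))
    return {label: response_rate(rows) for label, rows in groups.items()}


overall = response_rate(TRIAL)       # 12 of 50 patients improved
by_group = rate_by_subgroup(TRIAL)   # rates for marker_pos and marker_neg

print(f"Overall response rate: {overall:.0%}")
for label, rate in by_group.items():
    print(f"{label}: {rate:.0%}")
```

In this invented example the drug helps fewer than a quarter of patients overall, which could sink a trial, yet it works for most of the small marker-positive group.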
This is where Exscientia’s matchmaking technology comes in. “If we can select the right patients, it does fundamentally change the economic model of the pharma industry,” says Hopkins.
It will also dramatically improve the lives of patients, like Paul, who do not respond to the most common drugs. “Patients can have this terrible experience of going in and out of hospital, sometimes for years, getting drugs that don’t work, until either there’s no drugs left anymore or they finally get to the one that does work for them,” says Law.
After Exscientia found a drug that worked for Paul, the company followed up with a scientific study. It took tissue samples from dozens of cancer patients who had undergone at least two failed courses of chemotherapy and evaluated the effects of 139 existing drugs on their cells. Exscientia was able to identify a drug that worked for more than half of them.
The company now wants to use this technology to shape its approach to drug development, incorporating patient data into the earliest stages of the process to train even better AI. “Instead of starting with a model of a disease, we can start with tissue from a patient,” says Hopkins. “The patient is the best model.”
For now, the first batch of AI-designed drugs is still making its way through the clinical trial gauntlet. It could be months, or even years, before the first ones pass and hit the market. Some may not make it.
But even if this initial group fails, there will be another. Drug design has changed forever. “These are just the first drugs that these companies are trying,” says Benaich. “Their best drugs might be the ones that come after.”