
Google Has Released an AI Tool That Makes Sense of Your Genome

AI tools could help us turn information gleaned from genetic sequencing into life-saving therapies.
December 4, 2017
Brendan Monroe

Almost 15 years after scientists first sequenced the human genome, making sense of the enormous amount of data that encodes human life remains a formidable challenge. But it is also precisely the sort of problem that machine learning excels at.

On Monday, Google released a tool called DeepVariant that uses the latest AI techniques to build a more accurate picture of a person’s genome from sequencing data.

DeepVariant helps turn high-throughput sequencing readouts into a picture of a full genome. It automatically identifies small insertion and deletion mutations and single-base-pair mutations in sequencing data.

High-throughput sequencing became widely available in the 2000s and has made genome sequencing more accessible. But the data produced using such systems has offered only a limited, error-prone snapshot of a full genome. It is typically challenging for scientists to distinguish small mutations from random errors generated during the sequencing process, especially in repetitive portions of a genome. These mutations may be directly relevant to diseases such as cancer.

A number of tools exist for interpreting these readouts, including GATK, VarDict, and FreeBayes. However, these programs typically rely on simpler statistical and machine-learning approaches, identifying mutations by attempting to rule out read errors.
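To give a rough sense of what a simple statistical approach looks like, here is a minimal sketch in Python. It is not the code used by GATK, VarDict, or FreeBayes. The function names, the 1 percent per-read error rate, and the three-genotype binomial model are all assumptions chosen for illustration. Given how many reads at one position show a non-reference base, it asks which genotype best explains the count.

```python
from math import comb


def genotype_likelihoods(alt_reads, total_reads, error_rate=0.01):
    """Binomial likelihood of the observed alt-read count under each genotype.

    Hypothetical toy model: error_rate is an assumed per-read error
    probability, not a value taken from any real sequencing platform.
    """
    # Probability that a single read shows the alternate base, per genotype.
    alt_prob = {
        "hom_ref": error_rate,        # any alt read is a sequencing error
        "het": 0.5,                   # half the reads come from the alt copy
        "hom_alt": 1.0 - error_rate,  # any ref read is a sequencing error
    }
    ref_reads = total_reads - alt_reads
    return {
        genotype: comb(total_reads, alt_reads) * p**alt_reads * (1 - p)**ref_reads
        for genotype, p in alt_prob.items()
    }


def call_genotype(alt_reads, total_reads, error_rate=0.01):
    """Return the genotype with the highest likelihood."""
    likelihoods = genotype_likelihoods(alt_reads, total_reads, error_rate)
    return max(likelihoods, key=likelihoods.get)


if __name__ == "__main__":
    print(call_genotype(1, 30))   # hom_ref: one odd read is treated as an error
    print(call_genotype(14, 30))  # het: about half the reads carry the change
    print(call_genotype(30, 30))  # hom_alt: every read carries the change
```

A model like this handles clean cases well. It has a harder time when errors are not independent from read to read, which can happen in the repetitive regions mentioned above.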

“One of the challenges is in difficult parts of the genome, where each of the [tools] has strengths and weaknesses,” says Brad Chapman, a research scientist at Harvard’s School of Public Health who tested an early version of DeepVariant. “These difficult regions are increasingly important for clinical sequencing, and it’s important to have multiple methods.”

DeepVariant was developed by researchers from the Google Brain team, a group that focuses on developing and applying AI techniques, and Verily, another Alphabet subsidiary that is focused on the life sciences.

The team collected millions of high-throughput reads and fully sequenced genomes from the Genome in a Bottle (GIAB) project, a public-private effort to promote genomic sequencing tools and techniques. They fed the data to a deep-learning system and painstakingly tweaked the parameters of the model until it learned to interpret sequenced data with a high level of accuracy.

Last year, DeepVariant won first place in the PrecisionFDA Truth Challenge, a contest run by the FDA to promote more accurate genetic sequencing.

“The success of DeepVariant is important because it demonstrates that in genomics, deep learning can be used to automatically train systems that perform better than complicated hand-engineered systems,” says Brendan Frey, CEO of Deep Genomics.

The release of DeepVariant is the latest sign that machine learning may be poised to boost progress in genomics.

Deep Genomics is one of several companies trying to use AI approaches such as deep learning to tease out genetic causes of diseases and to identify potential drug therapies (see “An AI-Driven Genomics Company Is Turning to Drugs”).

Frey says AI will eventually go well beyond helping to sequence genomic data. “The gap that is currently blocking medicine right now is in our inability to accurately map genetic variants to disease mechanisms and to use that knowledge to rapidly identify life-saving therapies,” he says.

Another prominent company in this area is Wuxi Nextcode, which has offices in Shanghai, Reykjavik, and Cambridge, Massachusetts. Wuxi Nextcode has amassed the world’s largest collection of fully sequenced human genomes, and the company is investing heavily in machine-learning methods.

DeepVariant will also be available on the Google Cloud Platform. Google and its competitors are furiously adding machine-learning features to their cloud platforms in an effort to lure anyone who might want to tap into the latest AI techniques (see “Ambient AI Is About to Devour the Software Industry”).

In general, AI figures to help many aspects of medicine take big leaps forward in the coming years. There are opportunities to mine many different kinds of medical data (images or medical records, for example) to predict ailments that a human doctor might miss (see “The Machines Are Getting Ready to Play Doctor” and “A New Algorithm for Palliative Care”).

But genomic medicine represents an especially big opportunity, because the scale and complexity of the data are unprecedented. “For the first time in history, our ability to measure our biology, and even to act on it, has far surpassed our ability to understand it,” says Frey. “The only technology we have for interpreting and acting on these vast amounts of data is AI. That’s going to completely change the future of medicine.”

