Rewriting Life

Three Questions for J. Craig Venter

Gene research and Silicon Valley-style computing are starting to merge.

The number of human genomes being sequenced is increasing exponentially.

Genome scientist and entrepreneur J. Craig Venter is best known for being the first person to sequence his own genome, back in 2001.

This year, he started a new company, Human Longevity, which intends to sequence one million human genomes by 2020, and ultimately offer Web-based programs to help people store and understand their genetic data (see “Microbes and Metabolites Fuel an Ambitious Aging Project”).

Venter says that he’s sequenced 500 people’s genomes so far, and that volunteers are also starting to undergo a battery of tests measuring their strength, brain size, how much blood their hearts pump, and, in his words, “just about everything that can be measured about a person, without cutting them open.” This information will be fed into a database that can be used to discover links between genes and these traits, as well as disease.

But that’s going to require some massive data crunching. To get these skills, Venter recruited Franz Och, the machine-learning specialist leading Google Translate. Now Och will apply similar methods to studying genomes in a data science and software shop that Venter is establishing in Mountain View, California.

The hire comes just as Google itself has launched a similar-sounding effort to start collecting biomedical data (see “What’s a Moon Shot Worth These Days”). Venter calls Google’s plans for a biomedical database “a baby step, a much smaller version of what we are doing.”

What’s clear is that genome research and data science are coming together in new ways, and at a much larger scale than ever before. We asked Venter why.

How are we doing in genomics?

In my view there have not been a significant number of advances. One reason for that is that genomics follows a law of very big numbers. I’ve had my genome for 15 years, and there’s not much I can learn because there are not that many others to compare it to.

Why did you hire an expert in machine translation as your top data scientist?

Until now, there’s not been software for comparing my genome to your genome, much less to a million genomes. We want to get to a point where it takes a few seconds to compare your genome to all the others. It’s going to take a lot of work to do that.

Google Translate started as a slow algorithm that took hours or days to run and was not very accurate. But Franz [Och] built a machine-learning version that could go out on the Web and find every article translated from German to English or vice versa, and learn from those. And then it was optimized, so it works in milliseconds.

I convinced Franz, and he convinced himself, that understanding the human genome at the scale that we are trying to do it is going to be one of the greatest translation challenges in history. 

How is discovering the connection between genes and disease like translating languages?

Everything in a cell derives from your DNA code, all the proteins, their structure, whether they last seconds or days. All that is preprogrammed in DNA language. Then it is translated into life. People are going to be very surprised about how much of a DNA software species we are. 
