Bases to Bytes

Cheap sequencing technology is flooding the world with genomic data. Can we handle the deluge?
April 25, 2012

The cost of sequencing human genomes is plunging. In the most advanced genomics centers, it’s falling five times faster than the cost of computing. Increasingly, people are getting their DNA sequenced by companies and research labs in a search for clues about genetic variation and disease.

But the industry must figure out how to cheaply store all the resulting data. Each of the 3.2 billion DNA base pairs in a human genome can be encoded by two bits, which comes to 800 megabytes for the entire genome. In practice, though, considerable data about each base is usually collected, and genes are often sequenced many times to ensure accuracy, so it’s common to save around 100 gigabytes when sequencing a human genome with a machine made by industry leader Illumina. Keeping this much data about every person on the planet would require about as much digital storage as was available in the whole world in 2010.
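The two-bit arithmetic can be checked with a short sketch. The mapping of bases to bit patterns and the function names below are illustrative assumptions, not a standard file format, and the world population figure of roughly 7 billion is an assumption that does not appear in the article.

```python
# Hypothetical mapping: any fixed assignment of the four bases to 2-bit codes would work.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack_bases(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        # Left-align a short final chunk so the padding sits in the low bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def genome_size_bytes(base_pairs: int, bits_per_base: int = 2) -> int:
    """Storage needed for a genome at a fixed number of bits per base."""
    return base_pairs * bits_per_base // 8


# 3.2 billion base pairs at two bits each, expressed in megabytes.
compact_mb = genome_size_bytes(3_200_000_000) / 1_000_000

# Assumption: about 7 billion people, each with ~100 gigabytes of raw sequencing data.
ASSUMED_POPULATION = 7_000_000_000
raw_total_exabytes = ASSUMED_POPULATION * 100 * 10**9 / 10**18

print(pack_bases("ACGT").hex(), compact_mb, raw_total_exabytes)
```

Under these assumptions the compact encoding comes to 800 megabytes per genome, while the raw data for everyone on the planet would run to several hundred exabytes.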

The trick, then, will be to save less. Harvard geneticist George Church says that eventually only the differences between a newly sequenced genome and a reference genome will need to be stored. That information could be encoded in as little as four megabytes. Then your genome might be just another e-mail attachment.
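The idea of storing only the differences can be illustrated with a toy sketch. It assumes the two sequences have equal length and differ only by single-base substitutions; real genomes also differ by insertions and deletions, which this simplified example does not handle. The function names are hypothetical.

```python
def diff_against_reference(reference: str, sample: str) -> list[tuple[int, str]]:
    """Return (position, base) pairs where the sample differs from the reference.

    Simplifying assumption: equal-length sequences, substitutions only.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def reconstruct(reference: str, variants: list[tuple[int, str]]) -> str:
    """Rebuild the sample sequence from the reference plus the stored differences."""
    bases = list(reference)
    for position, base in variants:
        bases[position] = base
    return "".join(bases)


reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"

variants = diff_against_reference(reference, sample)
restored = reconstruct(reference, variants)
print(variants, restored == sample)
```

Only the list of differences needs to be kept for each person; the full sequence can be rebuilt on demand as long as everyone shares the same reference.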

Information graphics by Infographics.com
