The cost of sequencing human genomes is plunging—in the most advanced genomics centers, it’s falling five times faster than the cost of computing. Increasingly, people are getting their DNA sequenced by companies and research labs in a search for clues about genetic variation and disease.
But the industry must figure out how to cheaply store all the resulting data. Each of the 3.2 billion DNA base pairs in a human genome can be encoded in two bits, which comes to 800 megabytes for the entire genome. But considerable data about each base is usually collected, and genes are often sequenced many times to ensure accuracy, so it’s common to save around 100 gigabytes when sequencing a human genome with a machine made by industry leader Illumina. Keeping this much data about every person on the planet would require about as much digital storage as was available in the whole world in 2010.
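The arithmetic behind those figures can be checked with a short sketch. The base-pair count, the two bits per base, and the 100-gigabyte figure come from the text; the world population of roughly 7 billion is an assumption added here for illustration, as are the function names.

```python
# Back-of-the-envelope storage arithmetic for the figures quoted in the article.
BASE_PAIRS = 3_200_000_000          # base pairs in a human genome (from the text)
BITS_PER_BASE = 2                   # A, C, G and T fit in two bits (from the text)
RAW_BYTES_PER_GENOME = 100 * 10**9  # about 100 gigabytes saved per genome (from the text)
WORLD_POPULATION = 7 * 10**9        # ASSUMPTION: roughly 7 billion people


def minimal_genome_bytes(base_pairs=BASE_PAIRS, bits_per_base=BITS_PER_BASE):
    """Bytes needed to store one genome at two bits per base pair."""
    return base_pairs * bits_per_base // 8


def planet_storage_bytes(per_genome=RAW_BYTES_PER_GENOME, population=WORLD_POPULATION):
    """Bytes needed to keep the raw sequencing output for every person."""
    return per_genome * population


print(minimal_genome_bytes() / 10**6, "megabytes per minimally encoded genome")
print(planet_storage_bytes() / 10**18, "exabytes for everyone on the planet")
```

Under that population assumption, the raw data for everyone would come to several hundred exabytes, which is the scale the comparison with 2010's worldwide storage refers to.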
The trick, then, will be to save less. Harvard geneticist George Church says that eventually only the differences between a newly sequenced genome and a reference genome will need to be stored. That information could be encoded in as little as four megabytes. Then your genome might be just another e-mail attachment.
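The idea of keeping only the differences can be illustrated with a toy sketch. It is not Church's method or any real genomics format: it assumes two sequences of equal length that differ only by single-base substitutions, whereas real genomes also contain insertions and deletions. The function names and the short sequences are hypothetical.

```python
def diff_against_reference(reference, sample):
    """Return (position, base) pairs where the sample differs from the reference.

    ASSUMPTION: both sequences have the same length and differ only by
    substitutions; insertions and deletions are ignored in this toy model.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def reconstruct(reference, differences):
    """Rebuild the sample by applying the stored differences to the reference."""
    bases = list(reference)
    for position, base in differences:
        bases[position] = base
    return "".join(bases)


# Hypothetical ten-base sequences, for illustration only.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"

differences = diff_against_reference(reference, sample)
print(differences)                                    # two stored differences instead of ten bases
print(reconstruct(reference, differences) == sample)  # the full sample is recoverable
```

Only the list of differences needs to be stored for each person, since the reference is shared by everyone; the savings grow with how similar the new genome is to the reference.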
Information graphics by Infographics.com