Microsoft Reports a Big Leap Forward for DNA Data Storage

Microsoft says DNA could be a better way to store data for the long term than the magnetic tape companies rely on today.

It looks like a test tube with dried salt at the bottom, but Microsoft says it could be the future of data storage. The company reported today that it had written roughly 200 megabytes of data, including War and Peace and 99 other literary classics, into DNA.

Researchers have previously demonstrated that digital data can be stored in DNA, but Microsoft says no one has written so much of it into DNA at once.

DNA is a good storage medium because data can be written into its molecules more densely than conventional storage technologies can pack it, says Karin Strauss, Microsoft's lead researcher on the project, which also involves researchers from the University of Washington. Right now the technique is expensive and finicky, but the company hopes to piggyback on the plunging costs of tools for creating and reading DNA, a decline driven by the biotech industry. DNA is seen as a potential replacement for magnetic tape, which is the standard medium for long-term data storage today.

“The company is interested in learning whether we can create an end-to-end system that can store information, that’s automated, and can be used for enterprise storage, based on DNA,” says Strauss.

The pink smear in this test tube is DNA that has been synthesized to store digital data for long-term storage. Microsoft used the same technique to store roughly 200 megabytes of data.

Strauss says the project is motivated by the fact that electronic storage devices are not improving as quickly as the amount of data we use grows. “If you look at current projections, we can’t store all the information we want with devices at the cost that they are,” she says.

IDC predicts that the worldwide total of stored digital data will hit 16 trillion gigabytes next year, most of it housed in huge data centers. Strauss estimates that a shoebox's worth of DNA could hold the equivalent of roughly 100 giant data centers.

DNA can also be remarkably durable, particularly when kept cool and dry. In March, researchers announced that they had partially reconstructed the genomes of ancient humans whose bones had been in a Spanish cave for more than 400,000 years. In contrast, the magnetic tape that is the best long-term data storage option today lasts only a few decades before starting to degrade.

Storing data in DNA requires translating the 1s and 0s of binary digital files into long strings of the four different nucleotides, or bases, that make up DNA strands and write out the genetic code. In 2012, Harvard molecular biologist George Church wrote a 50,000-word book totaling less than a megabyte of data into DNA and printed it onto a glass chip smaller than a pollen grain. This year he reported having encoded 22 megabytes of digital data.
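
The translation from bits to bases described above can be illustrated with a naive mapping of two bits per base. This is only a sketch: the mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for illustration, not the scheme used by Microsoft or by Church, and a practical system would also need error correction and a way to address individual strands.

```python
# Hypothetical two-bits-per-base mapping, chosen only for illustration.
BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11


def encode(data: bytes) -> str:
    """Translate bytes into a string of bases, two bits per base."""
    out = []
    for byte in data:
        # Walk the byte from its most significant bit pair to its least.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(strand: str) -> bytes:
    """Translate a string of bases back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"War and Peace")
    print(strand)
    print(decode(strand))
```

Under this mapping every byte becomes four bases, so the 13-byte title above becomes a 52-base strand.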

Microsoft says it has now written almost 10 times as much digital data into a collection of millions of pieces of DNA, each 150 bases long.

Reinhard Heckel, a postdoctoral researcher at the University of California, Berkeley, who has worked on how to store data in DNA, calls that "impressive." But he says that the largest obstacle to making DNA data storage useful is the cost, because making custom DNA molecules is expensive. "For people to really pick it up, you need to store something cheaper than on tape, and that’s going to be hard," says Heckel.

Microsoft won’t disclose details of what it spent to make its 200-megabyte DNA data store, which required about 1.5 billion bases. But Twist Bioscience, which synthesized the DNA, typically charges 10 cents for each base. Commercially available synthesis can cost as little as 0.04 cents per base. Reading out a million bases costs roughly a penny.
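
To put those figures in proportion, the short calculation below multiplies them out. It is back-of-the-envelope arithmetic using only the numbers quoted in this article, not an estimate of what Microsoft actually paid, which the company has not disclosed. It assumes decimal megabytes, and the variable and function names are illustrative.

```python
# Figures quoted in the article; the actual prices paid were not disclosed.
DATA_BYTES = 200 * 10**6        # "roughly 200 megabytes" (decimal megabytes assumed)
TOTAL_BASES = 1_500_000_000     # "about 1.5 billion bases"
BASES_PER_STRAND = 150          # "each 150 bases long"


def dollars(bases: int, cents_per_base: float) -> float:
    """Cost in dollars of handling `bases` bases at a given per-base price in cents."""
    return bases * cents_per_base / 100


strands = TOTAL_BASES // BASES_PER_STRAND
bits_per_base = DATA_BYTES * 8 / TOTAL_BASES

typical_cost = dollars(TOTAL_BASES, 10)            # 10 cents per base
low_cost = dollars(TOTAL_BASES, 0.04)              # 0.04 cents per base
read_cost = dollars(TOTAL_BASES, 1 / 1_000_000)    # about a penny per million bases

print(f"strands:          {strands:,}")
print(f"bits per base:    {bits_per_base:.2f}")
print(f"synthesis, 10c:   ${typical_cost:,.0f}")
print(f"synthesis, 0.04c: ${low_cost:,.0f}")
print(f"one read-out:     ${read_cost:,.2f}")
```

On these numbers the store comes to about 10 million strands and roughly 1.07 bits of data per base. Synthesis would run to about $150 million at 10 cents per base or about $600,000 at 0.04 cents per base, while reading everything out once would cost about $15, which suggests that writing, not reading, dominates the expense.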

Strauss is confident that the costs of reading and writing DNA will plunge significantly in coming years. She says there is already evidence that they are falling faster than the cost of fabricating transistors did over the past 50 years, a trend that has been the engine of much innovation in computing.

It would have cost about $10 million to sequence a human genome in 2007 but close to only $1,000 in 2015.
