
How Computer Scientists Solved The Challenge of Zero-Maintenance Data Storage

Computer scientists have devised a way to make arrays of conventional hard discs 99.999 percent reliable over a four-year lifespan.

The cost of hard drive data storage has fallen dramatically in recent years. In 2008, this kind of storage cost around $0.11 per gigabyte. Today it costs around $0.04 per gigabyte and prices continue to drop. That’s having a significant impact on the way storage centres view their costs, because the price of replacing disks when they fail is increasingly dominated by the cost of the service call itself.

Today, Jehan-François Pâris at the University of Houston in Texas and a few pals say they have developed a way to eliminate the cost of service calls by creating data storage that is so reliable that it would not require any human intervention throughout its whole lifetime.

Their trick for making zero-maintenance data storage is to include enough spare discs to take on the data from any that fail. “We propose to reduce the maintenance cost of disk arrays by building self-repairing arrays that contain enough spare disks to operate without any human intervention during their whole lifetime,” they say. The team has simulated the behaviour of such a system and says it outperforms current data redundancy systems.

The most commonly used data storage technology is called RAID (redundant array of independent disks). The most advanced standard incarnation, called level 6, stores parity information alongside the data. That ensures that the data is redundant and can be recreated should up to two discs fail.

For example, a standard RAID architecture uses 4 parity discs to protect the contents of 6 data discs against all failures of up to 2 discs. Any new system would have to perform better than this.

Pâris and co have simulated the reliability of a storage system in which both parity discs and spare discs are included. The challenge is to achieve 99.999 percent reliability with only a reasonable increase in the amount of space needed to house spare discs (clearly, no datacentre could house an infinite number of spare discs).
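To see why the number of spares matters, here is a rough sketch of the chance that a pool of spares runs dry. It is not the team's simulation: it assumes failures arrive at a constant rate (a Poisson model), ignores repair time and parity entirely, and uses hypothetical inputs (50 active discs, a 5 percent annual failure rate) that do not come from the paper.

```python
import math

def p_spares_exhausted(active_discs: int, spares: int,
                       annual_failure_rate: float, years: float) -> float:
    """Probability of more than `spares` failures, assuming a Poisson failure count.

    Simplification (an assumption, not the paper's model): the number of
    active discs stays constant because each failed disc is replaced by a spare.
    """
    expected_failures = active_discs * annual_failure_rate * years
    term = math.exp(-expected_failures)  # P(exactly 0 failures)
    cdf = term
    for k in range(1, spares + 1):
        term *= expected_failures / k    # P(exactly k failures)
        cdf += term
    return max(0.0, 1.0 - cdf)

# Hypothetical inputs: 50 active discs, 5% annual failure rate, 4 years.
for spares in (10, 20, 30):
    print(spares, p_spares_exhausted(50, spares, 0.05, 4))
```

Under these assumptions the probability of exhausting the spares falls very quickly as spares are added, which is why a finite pool can plausibly last a whole lifetime.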

The team had to make a number of assumptions about the reliability of discs in running the simulation. They base these assumptions on the failure rates of 25,000 discs studied by the data storage company Backblaze.

The Backblaze data suggests that discs have a relatively high failure rate during the first 18 months, a much lower rate during the next 18 months and a high rate again after that. The team also assumes that a disc repair takes 24 hours, given a disc transfer rate of around 200 MB/s.
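The repair assumption is easy to sanity-check with a few lines of arithmetic. The 200 MB/s rate and the 24-hour window are the figures quoted here; the 4 TB capacity is a hypothetical value chosen for illustration, since the disc capacity is not stated.

```python
TRANSFER_RATE_MB_PER_S = 200  # figure quoted in the article
REPAIR_HOURS = 24             # figure quoted in the article

def data_moved_tb(rate_mb_per_s: float, hours: float) -> float:
    """Terabytes transferred at a sustained rate (decimal units: 1 TB = 1e6 MB)."""
    seconds = hours * 3600
    return rate_mb_per_s * seconds / 1_000_000

def rebuild_hours(capacity_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to write a whole disc of the given capacity at a sustained rate."""
    return capacity_tb * 1_000_000 / rate_mb_per_s / 3600

print(data_moved_tb(TRANSFER_RATE_MB_PER_S, REPAIR_HOURS))  # data moved in one repair window
print(rebuild_hours(4.0, TRANSFER_RATE_MB_PER_S))           # 4 TB is a hypothetical capacity
```

At a sustained 200 MB/s, a 24-hour window is enough to move about 17 TB, so a hypothetical 4 TB disc would need under six hours of raw copying. On that reading the 24-hour figure is a conservative allowance rather than a tight bound.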

Curiously, the simulations show that bigger is not necessarily better. They indicate that the best arrangement occurs when there are 45 data discs, 10 parity discs and 33 or 34 spare discs. This gives the best compromise between the extra space and the 99.999 percent reliability target.
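The space cost of that arrangement can be worked out directly from the disc counts. The helper below is purely illustrative (its name and output fields are not from the paper); it reports how many discs the array needs in total and what fraction of them actually hold user data.

```python
def layout_summary(data: int, parity: int, spare: int) -> dict:
    """Summarise the space overhead of an array layout."""
    total = data + parity + spare
    return {
        "total": total,                              # discs to house
        "data_fraction": data / total,               # share holding user data
        "extra_per_data": (parity + spare) / data,   # non-data discs per data disc
    }

for spares in (33, 34):
    print(spares, layout_summary(data=45, parity=10, spare=spares))

# For comparison: the 6-data, 4-parity layout mentioned earlier, with no spares.
print(layout_summary(data=6, parity=4, spare=0))
```

With 33 spares the array holds 88 discs, of which roughly half (45 of 88) store user data. That is the price of never sending a technician.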

In fact, when the number of data discs is much larger than this, the system can never achieve 99.999 percent reliability because of the time it takes to recover data when a disk fails. Larger systems have a greater chance of another disk failing. And if that happens during the recovery period, the system can never catch up, even if it has an infinite number of spare discs.
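A rough calculation shows how array size feeds into this risk. The sketch below assumes a constant annual failure rate with exponentially distributed failure times, which is simpler than the time-varying rates the team derived from Backblaze's data, and the 5 percent rate is a hypothetical value. It estimates only the chance that at least one other disc fails while a 24-hour repair is under way, not the chance of actual data loss.

```python
import math

HOURS_PER_YEAR = 8760

def p_another_failure(n_discs: int, annual_failure_rate: float,
                      repair_hours: float = 24.0) -> float:
    """Chance that at least one of the other n-1 discs fails during one repair.

    Assumes independent discs with a constant failure rate (an assumption,
    not the paper's model).
    """
    rate_per_hour = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    return 1.0 - math.exp(-(n_discs - 1) * rate_per_hour * repair_hours)

# Hypothetical 5% annual failure rate; array sizes chosen for illustration.
for n in (10, 55, 500):
    print(n, p_another_failure(n, 0.05))
```

The probability grows with the number of discs, so every repair in a larger array is more exposed to a second failure. Adding spares does nothing to shorten that window, which is consistent with the observation that even an infinite supply of spares cannot rescue a very large array.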

The team go on to compare this performance with the standard RAID level 6 architecture and show that it is significantly better.

That is an interesting new take on storage reliability that could make data storage not just more reliable, but cheaper too. And with the price of magnetic storage set to fall even further, the importance of taking human interventions out of the loop is only likely to increase.

Ref: arxiv.org/abs/1501.00513 : Self-Repairing Disk Arrays
