The Hidden Risk of a Meltdown in the Cloud

The cloud could suffer the same kind of collapse that plagues the financial system, according to an analysis of the unrecognised risks of cloud computing.

The cloud is essentially a metaphor for a network of computers in which computational tasks and resources can be shared. 

The big idea here is that users simply rent the computing power, the storage or an application for as long as they need it without having to invest in the infrastructure behind it. 

That makes computing cheaper, easier and more efficient. 

There are well-known problems, of course. The most obvious relates to guaranteeing the security of data when it is stored on computers that a user does not own and that many others can also access. But various solutions have emerged, such as encrypting data before it is sent to the cloud. For that reason, the migration to the cloud is proceeding at full speed in many places.

That may be folly. Today, Bryan Ford at Yale University in New Haven says that the full risks of this migration have yet to be properly explored. He points out that complex systems can fail in many unexpected ways and outlines various simple scenarios in which a cloud could come unstuck. 

In the worst-case scenario, a cloud could experience a full meltdown that could seriously threaten any business that relies on it.

Ford identifies a number of different possibilities. One example involves an application provider that bases its services in the cloud, such as a cloud-based advertising service.

He imagines a simple scenario in which the cloud operator distributes the service between two virtual servers, using a power balancing program to switch the load from one server to the other as conditions demand. 

However, the application provider may also have a load balancing program that distributes the customer load.

Now Ford imagines the scenario in which both load balancing programs operate with the same refresh period, say once a minute. When these periods coincide, the control loops start sending the load back and forth between the virtual servers in a positive feedback loop. 

“The two controllers each compensate with a stronger action causing a larger swing the next minute,” says Ford. Clearly, this is a process that must eventually spiral out of control and crash the system.

Ford is careful to put the risk in context: “This simplistic example might be unlikely to occur in exactly this form on real systems—or might be quickly detected and “fixed” during development and testing—but it suggests a general risk,” he says.
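The feedback loop Ford describes can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Ford's own model: the function name `simulate`, the `gain` of 1.2, and the starting imbalance are all hypothetical values chosen for illustration. Each controller reads the load imbalance between the two servers at the start of a period and shifts load to correct it, unaware that the other controller is doing the same.

```python
def simulate(num_controllers, gain=1.2, initial_imbalance=10.0, steps=8):
    """Return the load imbalance (server A minus server B) after each period.

    Illustrative assumptions: every controller acts once per period, all of
    them read the same imbalance at the start of that period, and each one
    slightly over-corrects (gain > 1). Server capacity limits are ignored.
    """
    imbalance = initial_imbalance
    history = [imbalance]
    for _ in range(steps):
        observed = imbalance  # the stale reading shared by all controllers
        for _ in range(num_controllers):
            moved = gain * observed / 2  # load shifted from A to B
            imbalance -= 2 * moved       # moving load changes both servers
        history.append(imbalance)
    return history


if __name__ == "__main__":
    for n in (1, 2):
        trace = ", ".join(f"{value:.1f}" for value in simulate(n))
        print(f"{n} controller(s): {trace}")
```

With these assumed numbers, one controller acting alone multiplies the imbalance by -0.2 each period, so the swings die away. Two controllers acting on the same reading multiply it by -1.4, so the load bounces between the servers with growing amplitude, which is the kind of escalating swing Ford's quote describes.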

In fact, this kind of coupling between an application provider and the infrastructure provider is inevitable, particularly when the cloud provider’s system is opaque so that external users cannot see the internal control loop cycles and so avoid them.

“Non-transparent layering structures…may create unexpected and potentially catastrophic failure correlations, reminiscent of financial industry crashes,” he says.

But the lack of transparency is only part of the story. A more general risk arises when systems are complex because seemingly unrelated parts can become coupled in unexpected ways.  

A growing number of complexity theorists are beginning to recognise this problem. The consensus forming among them is that bizarre and unpredictable behaviour often emerges in systems made up of “networks of networks”.

An obvious example is the flash crashes that now plague many financial markets, in which prices plummet dramatically for no apparent reason. Understanding how and why this happens is the focus of much research.

Given that the cloud is clearly becoming a network of networks that is rapidly growing in complexity, it’s not hard to imagine that the computing equivalent of flash crashes is not just likely but inevitable.

Of course, it would be easy for cloud providers to say that their systems are carefully designed and monitored and entirely risk-free in this respect. That would be an understandable knee-jerk reaction from a PR department.

But it ought to be a worrying sign for any customer, indicating that the providers simply do not understand the problem, let alone have a solution for it. 

Ford concludes with the following:  “We should study [these unrecognised risks] before our socioeconomic fabric becomes inextricably dependent on a convenient but potentially unstable computing model.”

Clearly, an eminently sensible suggestion.

Ref: arxiv.org/abs/1203.1979: Icebergs in the Clouds: the Other Risks of Cloud Computing
