
Basking in Big Data

Visualization software makes viewing and interacting with enormous data sets practical without a supercomputer.
January 16, 2009

In some ways, science is suffering from too much data. Experiments and computer simulations analyzing everything from the dynamics of climate change to the precise details of folding proteins can churn out billions of numbers describing these physical phenomena. Making sense of all this data remains a challenge.

Data extraction: This image shows an experiment in which aerogel, a porous material, is bombarded by a micrometeoroid traveling at five kilometers per second. Aerogels are commonly used to shield electronic equipment in satellites because they are both durable and extremely light. The Morse-Smale complex identifies the structure of the porous solid as the micrometeoroid enters it, providing detailed information about the filament structure of the material (shown at right).

Recently, however, researchers at the University of California, Davis, and Lawrence Livermore National Laboratory announced that they have developed software that makes analysis and visualization of huge data sets possible without the aid of a supercomputer. The researchers’ algorithm slices up data into more manageable chunks, then stitches it back together on the fly, so that the data can be manipulated in three dimensions, all on a computer with the power and capacity of a high-end laptop.

The team’s algorithm offers a practical way to get structural information about materials, proteins, and fluids, says Attila Gyulassy, the researcher at UC Davis who led the project. It allows users to “interactively visualize, rotate, apply different transfer functions, and highlight different aspects of the data,” he says.


The software employs a mathematical tool called the Morse-Smale complex, which has been used for around 40 years to extract and visualize elements of large data sets by sorting them into segments that contain mathematically similar features. But while the Morse-Smale complex has been known for decades, it normally requires huge amounts of memory to perform the necessary calculations on a computer.
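To give a rough sense of what "sorting data into segments" can mean, here is a minimal one-dimensional sketch in Python. It is not the researchers' software and not a full Morse-Smale complex, which works on three-dimensional data and tracks far more structure. It only labels each sample with the local minimum reached by repeatedly stepping to a lower neighbor, so that samples sharing a label form one segment. The function name `descending_basins` and the sample values are illustrative assumptions.

```python
def descending_basins(values):
    """Label each sample with the index of the local minimum it descends to.

    Illustrative only: a 1D stand-in for the gradient-based segmentation
    idea, not the Morse-Smale computation described in the article.
    """
    n = len(values)

    def lowest_neighbor(i):
        # Return the index of the strictly lowest value among i and its neighbors.
        best = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and values[j] < values[best]:
                best = j
        return best

    labels = []
    for i in range(n):
        current = i
        while True:
            nxt = lowest_neighbor(current)
            if nxt == current:  # no lower neighbor: this is a local minimum
                break
            current = nxt
        labels.append(current)
    return labels


samples = [3, 1, 2, 5, 4, 0, 6]  # hypothetical measurements
print(descending_basins(samples))
```

In this toy input, the first four samples drain to the minimum at index 1 and the last three drain to the minimum at index 5, giving two segments.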

Gyulassy and his colleagues found a solution to this memory problem by writing an algorithm that breaks apart a data set before using the Morse-Smale complex, then stitches the blocks back together again. This means that only a small amount of data is needed at each step, so much less has to be stored in memory. As a result, the software can run on a desktop computer with just two gigabytes of memory.
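The split-then-stitch idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the team's algorithm: it works on a one-dimensional list, looks only for local minima, and uses hypothetical names (`local_minima`, `blockwise_minima`, `block_size`). Each block is examined together with one extra sample on either side, so that results at block edges agree with what a whole-data-set pass would find, and only one block needs to be examined at a time.

```python
def local_minima(values):
    """Indices of samples strictly lower than every neighbor they have."""
    n = len(values)
    return [
        i for i in range(n)
        if (i == 0 or values[i] < values[i - 1])
        and (i == n - 1 or values[i] < values[i + 1])
    ]


def blockwise_minima(values, block_size):
    """Find the same minima one block at a time, then stitch the results.

    Illustrative only: each block is padded with one neighboring sample on
    each side, and results found on the padding itself are discarded.
    """
    n = len(values)
    stitched = []
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        lo = max(start - 1, 0)   # one padding sample on the left, if any
        hi = min(stop + 1, n)    # one padding sample on the right, if any
        block = values[lo:hi]    # the only data examined in this step
        for k in local_minima(block):
            g = lo + k           # convert block index to global index
            if start <= g < stop:
                stitched.append(g)
    return stitched


samples = [3, 1, 2, 5, 4, 0, 6]  # hypothetical measurements
print(blockwise_minima(samples, block_size=3))
print(local_minima(samples))
```

Both calls report the same indices, which is the point of the padding: the block boundaries do not change the answer, only how much data is examined at once.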

Memory is one of the big limiting factors when trying to perform complex analysis of large data sets, says Peter Schröder, a professor of computer science at the California Institute of Technology, in Pasadena. “You can’t even fit the stuff in memory,” he says. “But [the researchers] have addressed it.”

Schröder adds that, while the new software isn’t the only data-visualization tool available, it looks particularly powerful and practical for a number of scientific applications. Algorithms such as this are changing science, he adds: “Things that used to be considered too abstract or too crazy to use for data analysis are turning not just into algorithms, but practical algorithms.”

Gyulassy says that his team has plans to release an open-source software library by the end of March so that other researchers can take advantage of the approach, and modify it to suit their needs.
