Basking in Big Data
In some ways, science is suffering from too much data. Experiments and computer simulations analyzing everything from the dynamics of climate change to the precise details of folding proteins can churn out billions of numbers describing these physical phenomena. Making sense of all this data remains a challenge.
Recently, however, researchers at the University of California, Davis, and Lawrence Livermore National Laboratory announced that they have developed software that makes analysis and visualization of huge data sets possible without the aid of a supercomputer. The researchers’ algorithm slices the data into more manageable chunks, then stitches them back together on the fly, so that the data can be manipulated in three dimensions, all on a computer with the power and capacity of a high-end laptop.
The team’s algorithm offers a practical way to get structural information about materials, proteins, and fluids, says Attila Gyulassy, the researcher at UC Davis who led the project. It allows users to “interactively visualize, rotate, apply different transfer functions, and highlight different aspects of the data,” he says.
The software employs a mathematical tool called the Morse-Smale complex, which has been used for around four years to extract and visualize elements of large data sets by sorting them into segments that contain mathematically similar features. Although the Morse-Smale complex itself has been known for decades, computing it normally requires huge amounts of memory.
Gyulassy and his colleagues found a solution to this memory problem by writing an algorithm that breaks apart a data set before using the Morse-Smale complex, then stitches the blocks back together again. This means that only a small amount of data is needed at each step, so much less has to be stored in memory. As a result, the software can run on a desktop computer with just two gigabytes of memory.
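To make the divide-and-stitch idea concrete, here is a minimal sketch in Python. It is not the researchers' algorithm and does not compute a Morse-Smale complex. As a stand-in, it finds the local maxima of a one-dimensional list of numbers, one simple kind of feature such an analysis identifies, while holding only one small block in memory at a time. The function names, the `load` callback, and the one-sample overlap between blocks are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical loader type: returns samples [start, stop) of a large field,
# standing in for a read from disk so the whole field never sits in memory.
Loader = Callable[[int, int], Sequence[float]]


def maxima_in_block(load: Loader, start: int, stop: int, n: int) -> List[int]:
    """Return global indices of strict local maxima inside [start, stop)."""
    # Read one extra sample on each side so that points on the block
    # boundary can still be compared with their true neighbors.
    lo = max(start - 1, 0)
    hi = min(stop + 1, n)
    chunk = load(lo, hi)

    found = []
    for i in range(start, stop):
        v = chunk[i - lo]
        left_ok = i == 0 or chunk[i - 1 - lo] < v
        right_ok = i == n - 1 or chunk[i + 1 - lo] < v
        if left_ok and right_ok:
            found.append(i)
    return found


def blockwise_maxima(load: Loader, n: int, block_size: int) -> List[int]:
    """Process the field block by block, then stitch the results together."""
    result: List[int] = []
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        # Only this block (plus its two overlap samples) is loaded here.
        result.extend(maxima_in_block(load, start, stop, n))
    return result


if __name__ == "__main__":
    field = [0, 2, 1, 3, 5, 4, 4, 6, 1]

    def load(a: int, b: int) -> Sequence[float]:
        return field[a:b]  # a real loader would read from a file instead

    print(blockwise_maxima(load, len(field), block_size=3))  # [1, 4, 7]
```

In this toy case the stitching step is a simple concatenation, because each block's overlap samples already settle what happens at its edges. In the software described here, merging blocks is the hard part, since features that span several blocks have to be reconciled. The sketch shows only why working set size can depend on the block rather than on the whole data set.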
Memory is one of the big limiting factors when trying to perform complex analysis of large data sets, says Peter Schröder, a professor of computer science at California Institute of Technology, in Pasadena. “You can’t even fit the stuff in memory,” he says. “But [the researchers] have addressed it.”
Schröder adds that, while the new software isn’t the only data-visualization tool available, it looks particularly powerful and practical for a number of scientific applications. Algorithms such as this are changing science, he adds: “Things that used to be considered too abstract or too crazy to use for data analysis are turning not just into algorithms, but practical algorithms.”
Gyulassy says that his team has plans to release an open-source software library by the end of March so that other researchers can take advantage of the approach, and modify it to suit their needs.