Skip to Content



A new paradigm for managing data

Open data lakehouse architectures speed insights and deliver self-service analytics capabilities.

In partnership withDremio

Regeneron Pharmaceuticals, a biotechnology company that develops life-transforming medicines, found itself inundated with vast volumes of data during the peak of the covid-19 pandemic. In order to derive actionable information from these disparate data sets, which ranged from clinical trial data to real-time supply chain information, the company needed new ways to join and relate them, regardless of what format they were in or where they came from.

Stock graphic of an outward spiraling lines, dots, and rectangles

A new paradigm for managing data

Shah Nawaz, chief technology officer and vice president of digital technology and engineering at Regeneron, says, “At the time, everybody in the world was reporting on their covid-19 findings from different countries and in different languages.” The challenge was how to make sense of these massive data sets in a timely manner, assisting researchers and clinicians, and ultimately getting the best treatments to patients faster. After all, he says, “when you’re dealing with large-scale data sets in hundreds, if not thousands, of locations, connecting the dots can be a complex problem.”

Regeneron isn’t the only company eager to derive more value from its data. Despite the enormous amounts of data they collect and the amount of capital they invest in data management solutions, business leaders are still not benefitting from their data. According to IDC research, 83% of CEOs want their organizations to be more data driven, but they struggle with the cultural and technological changes needed to execute an effective data strategy.

In response, many organizations, including Regeneron, are turning to a new form of data architecture as a modern approach to data management. In fact, by 2024, more than three-quarters of current data lake users will be investing in this type of hybrid “data lakehouse” architecture to enhance the value generated from their accumulated data, according to Matt Aslett, a research director with Ventana Research.

“Data lakehouse” is the term for a modern, open data architecture that combines the performance and optimization of a data warehouse with the flexibility of a data lake. But achieving the speed, performance, agility, optimization, and governance promised by this technology also requires embracing best practices that prioritize corporate goals and support enterprise-wide collaboration.

Download the report

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Deep Dive


What’s next for the world’s fastest supercomputers

Scientists have begun running experiments on Frontier, the world’s first official exascale machine, while facilities worldwide build other machines to join the ranks.

The future of open source is still very much in flux

Free and open software have transformed the tech industry. But we still have a lot to work out to make them healthy, equitable enterprises.

The beautiful complexity of the US radio spectrum

The United States Frequency Allocation Chart shows how the nation’s precious radio frequencies are carefully shared.

How ubiquitous keyboard software puts hundreds of millions of Chinese users at risk

Third-party keyboard apps make typing in Chinese more efficient, but they can also be a privacy nightmare.

Stay connected

Illustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

Thank you for submitting your email!

Explore more newsletters

It looks like something went wrong.

We’re having trouble saving your preferences. Try refreshing this page and updating them one more time. If you continue to get this message, reach out to us at with a list of newsletters you’d like to receive.