


A new paradigm for managing data

Open data lakehouse architectures speed insights and deliver self-service analytics capabilities.

In partnership with Dremio

Regeneron Pharmaceuticals, a biotechnology company that develops life-transforming medicines, found itself inundated with vast volumes of data during the peak of the covid-19 pandemic. To derive actionable information from these disparate data sets, which ranged from clinical trial data to real-time supply chain information, the company needed new ways to join and relate them, regardless of their format or origin.


Shah Nawaz, chief technology officer and vice president of digital technology and engineering at Regeneron, says, “At the time, everybody in the world was reporting on their covid-19 findings from different countries and in different languages.” The challenge was making sense of these massive data sets in a timely manner, assisting researchers and clinicians, and ultimately getting the best treatments to patients faster. After all, he says, “when you’re dealing with large-scale data sets in hundreds, if not thousands, of locations, connecting the dots can be a complex problem.”

Regeneron isn’t the only company eager to derive more value from its data. Despite the enormous amounts of data they collect and the capital they invest in data management solutions, business leaders are still not benefiting from their data. According to IDC research, 83% of CEOs want their organizations to be more data driven, but they struggle with the cultural and technological changes needed to execute an effective data strategy.

In response, many organizations, including Regeneron, are turning to a new form of data architecture as a modern approach to data management. In fact, by 2024, more than three-quarters of current data lake users will be investing in this type of hybrid “data lakehouse” architecture to enhance the value generated from their accumulated data, according to Matt Aslett, a research director with Ventana Research.

“Data lakehouse” is the term for a modern, open data architecture that combines the performance and optimization of a data warehouse with the flexibility of a data lake. But achieving the speed, performance, agility, optimization, and governance promised by this technology also requires embracing best practices that prioritize corporate goals and support enterprise-wide collaboration.
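The core pattern described above can be sketched in miniature: raw files land in an open format on the “lake” side, and a query engine exposes them through a warehouse-style SQL interface without a separate load step into a proprietary store. The toy below is only an illustration of that idea, using Python’s standard-library `sqlite3` as a stand-in query engine and an invented CSV data set; real lakehouse platforms (such as Dremio, mentioned above) instead query open formats like Parquet or Apache Iceberg in place.

```python
import csv
import io
import sqlite3

# "Lake" side: raw data in an open, file-based format (here, CSV).
# The file contents and schema below are invented for illustration.
raw_csv = io.StringIO(
    "trial_id,site,enrolled\n"
    "T-001,Berlin,120\n"
    "T-002,Tokyo,85\n"
)

# "Warehouse" side: a SQL interface over that raw data.
# sqlite3 is a stand-in for a lakehouse query engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trials (trial_id TEXT, site TEXT, enrolled INTEGER)"
)
rows = [
    (r["trial_id"], r["site"], int(r["enrolled"]))
    for r in csv.DictReader(raw_csv)
]
conn.executemany("INSERT INTO trials VALUES (?, ?, ?)", rows)

# Analysts query with ordinary SQL, unaware of the underlying file format.
total = conn.execute("SELECT SUM(enrolled) FROM trials").fetchone()[0]
print(total)  # 205
```

The point of the pattern is the separation of concerns: storage stays in open, engine-agnostic formats, while performance, optimization, and governance live in the query layer on top.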

Download the report

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

