
Over 24,000 coronavirus research papers are now available in one place

The data set aims to accelerate scientific research that could fight the Covid-19 pandemic.
March 16, 2020
A scientist conducting research. Laurence Dutton / Getty

The news: Today researchers collaborating across several organizations released the Covid-19 Open Research Dataset (CORD-19), which includes over 24,000 research papers from peer-reviewed journals as well as sources like bioRxiv and medRxiv (websites where scientists can post non-peer-reviewed preprint papers). The research covers SARS-CoV-2 (the scientific name for the coronavirus), Covid-19 (the scientific name for the disease), and the coronavirus group. It represents the most extensive collection of scientific literature related to the ongoing pandemic and will continue to be updated in real time as more research is released.

How it came together: The database was compiled at the request of the White House Office of Science and Technology Policy (OSTP) through a collaboration between three organizations. The National Library of Medicine (NLM) at the National Institutes of Health provided access to existing scientific publications; Microsoft used its literature curation algorithms to find relevant articles; and the Allen Institute for Artificial Intelligence (AI2), a research nonprofit, converted them from web pages and PDFs into a structured format that can be processed by algorithms. The database is now available on AI2’s Semantic Scholar website.

What has already been done: As part of its Semantic Scholar service, which allows the scientific community to easily search through academic literature, AI2 has already processed the new corpus using the same information extraction and analysis techniques that it applies to all new research. It’s surfacing key pieces of information such as authors, methods, data, and citations to make it easier for scientists to quickly evaluate how each paper adds to the existing research.

It’s also using state-of-the-art natural-language models like ELMo and BERT to map out the similarities between papers. This map is now powering a new feature on Semantic Scholar that allows researchers to create a personalized research feed based on their interests.
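To give a rough sense of what "mapping out the similarities between papers" can mean, here is a minimal sketch in Python. It is not AI2's method: models like ELMo and BERT turn text into dense learned vectors, whereas this example substitutes simple word counts so that it runs with the standard library alone. The comparison step, cosine similarity between two vectors, is a common choice and is assumed here for illustration. The paper names and abstracts are made up and do not come from CORD-19.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn text into a sparse bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, papers):
    """Rank papers (name -> text) from most to least similar to the query."""
    q = term_vector(query)
    scored = [
        (cosine_similarity(q, term_vector(text)), name)
        for name, text in papers.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical abstracts, for illustration only.
papers = {
    "paper_a": "transmission dynamics of coronavirus in hospital settings",
    "paper_b": "protein structure of the coronavirus spike",
    "paper_c": "rainfall prediction with radar data",
}

ranking = most_similar("coronavirus transmission in hospitals", papers)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

A feed built this way would show a researcher the papers whose vectors sit closest to the ones they have already marked as interesting. Swapping the word counts for vectors from a language model changes how well the ranking captures meaning, but not the overall shape of the comparison.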

Why it matters: Scientists are racing against the clock to answer pressing questions about the nature of the virus in hopes of stemming its spread. The database not only consolidates existing research in one place but also makes the body of literature easier to mine for insights with natural-language processing algorithms. The OSTP has launched an open call for AI researchers to develop new techniques for text and data mining that will help the medical community comb through the mass of information faster.
