The news: Today researchers collaborating across several organizations released the Covid-19 Open Research Dataset (CORD-19), which includes over 24,000 research papers from peer-reviewed journals as well as sources like bioRxiv and medRxiv (websites where scientists can post non-peer-reviewed preprint papers). The research covers SARS-CoV-2 (the scientific name for the coronavirus), Covid-19 (the scientific name for the disease), and the coronavirus group. It represents the most extensive collection of scientific literature related to the ongoing pandemic and will continue to update in real time as more research is released.
How it came together: The database was compiled at the request of the White House Office of Science and Technology Policy (OSTP) through a collaboration among three organizations. The National Library of Medicine (NLM) at the National Institutes of Health provided access to existing scientific publications; Microsoft used its literature curation algorithms to find relevant articles; and the research nonprofit Allen Institute for Artificial Intelligence (AI2) converted them from web pages and PDFs into a structured format that can be processed by algorithms. The database is now available on AI2’s Semantic Scholar website.
What has already been done: As part of its Semantic Scholar service, which allows the scientific community to easily search through academic literature, AI2 has already processed the new corpus using the same information extraction and analysis techniques that it applies to all new research. It’s surfacing key pieces of information such as authors, methods, data, and citations to make it easier for scientists to quickly evaluate how each paper adds to the existing research.
It’s also using state-of-the-art natural-language models like ELMo and BERT to map out the similarities between papers. This map is now powering a new feature on Semantic Scholar that allows researchers to create a personalized research feed based on their interests.
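To make the idea of a similarity map concrete, here is a minimal sketch in Python of one common way to compare documents once a language model has turned each into a vector: cosine similarity. This is an illustration under stated assumptions, not AI2's actual pipeline. The three-dimensional vectors are invented toy values rather than real ELMo or BERT output, and the names `cosine_similarity`, `most_similar`, and `toy_embeddings` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=2):
    """Rank every other paper by similarity to the query paper."""
    query = embeddings[query_id]
    scored = [
        (paper_id, cosine_similarity(query, vector))
        for paper_id, vector in embeddings.items()
        if paper_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Invented toy vectors; real model embeddings have hundreds of dimensions.
toy_embeddings = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.8, 0.2, 0.1],
    "paper_c": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for paper_id, score in most_similar("paper_a", toy_embeddings):
        print(paper_id, round(score, 3))
```

In this toy data, `paper_b` ranks closest to `paper_a` because their vectors point in nearly the same direction. A recommendation feed could plausibly be built by surfacing the nearest neighbors of papers a researcher has already saved.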
Why it matters: Scientists are racing against the clock to answer pressing questions about the nature of the virus in hopes of stemming its spread. The database not only helps them consolidate existing research in one place but also makes the body of literature easier to mine for insights with natural-language processing algorithms. The OSTP has launched an open call for AI researchers to develop new text- and data-mining techniques that will help the medical community comb through the mass of information faster.
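As a rough illustration of what mining a structured corpus can look like at its simplest, the Python sketch below ranks records by how often they mention a set of query terms. It is a toy under stated assumptions: the `title` and `abstract` field names are hypothetical and may not match the dataset's real schema, the records are invented placeholders rather than real papers, and `tokenize` and `rank_by_term_frequency` are illustrative names.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_by_term_frequency(records, terms):
    """Return (title, score) pairs for records that mention any query term."""
    wanted = {term.lower() for term in terms}
    ranked = []
    for record in records:
        counts = Counter(tokenize(record["abstract"]))
        score = sum(counts[term] for term in wanted)
        if score > 0:
            ranked.append((record["title"], score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Invented placeholder records; the field names are assumptions.
toy_records = [
    {"title": "Placeholder paper 1",
     "abstract": "Incubation period estimates and transmission dynamics."},
    {"title": "Placeholder paper 2",
     "abstract": "Transmission in households; transmission risk by contact."},
    {"title": "Placeholder paper 3",
     "abstract": "Protein structure of the viral spike."},
]

if __name__ == "__main__":
    for title, score in rank_by_term_frequency(toy_records, ["transmission"]):
        print(score, title)
```

Raw term counting is only a baseline. The techniques the open call asks for would go well beyond it, but any of them depends on the same precondition: text that is already in a machine-readable structure.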