For several years, clinicians and computer scientists in the U.S. and abroad have been trying to improve cancer care, from diagnosis to treatment, by building vast, interconnected databases full of patient information. They call these repositories “medical grids” and envision the day when a physician in Strasbourg or New Delhi can see, for example, that an indecipherable image of a patient’s lung is very similar to that of a San Francisco patient, whose case history could inform the decision to perform a biopsy.
These nascent databases hold not only patients’ medical histories, including such data as MRIs and CT scans, but also information about how they have responded to drugs. But the benefits of these under-construction grids have been slow to come, partly because of technical problems and partly because federal privacy rules make data sharing difficult. Now, a National Cancer Institute project could test a multihospital system for comparing lung cancer images as early as this year, a clear move toward putting grids to use.
Kenneth H. Buetow, director of the institute’s Center for Bioinformatics in Bethesda, MD, calls it a crucial first step toward “a World Wide Web of cancer research.”
In the past year or so, Buetow and his team have collected more than 50,000 images of lung cancers obtained from medical trials and archived them in a secure electronic repository at NCI. Their effort is part of a three-year, $60 million pilot project launched in 2004, which involves 50 cancer centers and more than 600 researchers. The archive is now available on the Internet at http://ncia.nci.nih.gov. In addition to other imaging projects, it contains a large collection of lung cancer cases followed throughout their therapy.
With the database now largely in place, testing is imminent. The image collection is intended to encourage and facilitate research into new software that can automatically compare images of lungs with those already in the database. In such software, algorithms will search for commonalities and build a directory of the likeliest matches. Clinicians in offices and hospitals will be able to compare the resulting lung images with the scans they need to evaluate.
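How such matching software will work is not described in detail. As a rough illustration only, the sketch below assumes that each scan has already been reduced to a short list of numeric features (a hypothetical preprocessing step not covered here) and ranks archived cases by cosine similarity to a new scan. The function names, case identifiers, and feature values are invented for the example and do not reflect the NCI archive's actual software.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no information to compare
    return dot / (norm_a * norm_b)


def rank_matches(
    query: Sequence[float],
    archive: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every archived case against the query and keep the top_k best."""
    scored = [
        (case_id, cosine_similarity(query, features))
        for case_id, features in archive.items()
    ]
    # Highest similarity first: this is the "directory of likeliest matches".
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical archive: case IDs mapped to made-up feature vectors.
    archive = {
        "case-a": [1.0, 0.0],
        "case-b": [0.0, 1.0],
        "case-c": [1.0, 1.0],
    }
    for case_id, score in rank_matches([1.0, 0.1], archive):
        print(f"{case_id}: {score:.3f}")
```

A real system would need far richer image features and clinical validation; the point here is only the shape of the computation, which scores every archived case and returns a short ranked list for a clinician to review.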
Comparing images is just the first step. If all goes well, within three years the National Cancer Institute hopes to conduct one or more clinical trials in which a vast amount of medical data about lung cancer (including images, types of tumors, drug courses, patient outcomes, even the molecular profiles of the disease) would be used by physicians studying specific cases. The outcomes of these cases would be compared to those of cases treated through conventional approaches to cancer diagnosis. That comparison should yield information not just about the medical response of the patients but also about the accuracy with which the doctors made their diagnoses, and even the degree to which they adhered to standards of medical privacy.
Medical-grid researchers are not short on vision. In cases where the scans match, doctors hope to be able to dig deeper into the histories of similar cases and learn which drugs or surgeries worked best. And Buetow says his trials could actually hasten the day when some cancer diagnoses are automated. A doctor could input images (and as the grid expands, blood test results, descriptions of genetic markers, and other patient data) and learn how frequently near-identical test results from patients around the world correlate with specific malignancies such as lymphomas, melanomas, or sarcomas.
And in the future, as gene-sequencing costs come down, the NCI’s grid could even include patients’ genomic information. “The power of the grid is in its capability to aggregate and correlate more and more public-health data from around the world,” said Mary Kratz of the University of Michigan Medical School, a technical advisor to the grid research community. “The more data you have, the more knowledge you generate.”
Meanwhile, mundane technical problems need solving.
Since the data that accompany images vary in type and format from hospital to hospital, researchers are developing standard formats that can harmonize them all. “We’re asking researchers at many competitive institutions to tear down barriers to sharing vast amounts of data,” says Howard Bilofsky, senior fellow at the Center for Bioinformatics at the University of Pennsylvania, which participates in NCI’s project. “Being able to share information in grids across the world in the arena of life science research is not something that is easily done.”