Researchers have built a system to create a larger and more diverse data set on which to train medical AI.

You’re only as effective as your data set: Most artificial-intelligence programs rely on a large set of information to learn from. But if the data isn’t representative of all populations or circumstances, the system could be biased or ineffective.

The news: In a new study, researchers from chip company Nvidia, the Mayo Clinic, and the MGH & BWH Center for Clinical Data Science describe an algorithm that produces a more diverse set of medical data. Using generative adversarial networks (GANs), it creates synthetic scans depicting abnormalities from existing MRIs of brain tumors.

Why it matters: “Diversity is critical to success when training neural networks, but medical imaging data is usually imbalanced,” Hoo Chang Shin, a research scientist at Nvidia, told ZDNet. “There are so many more normal cases than abnormal cases, when abnormal cases are what we care about, to try to detect and diagnose.”