Using donated computing power and drawing on the theory of quantum mechanics, Harvard researchers have computationally screened 2.3 million organic molecules for properties relevant to photovoltaic applications and then organized them into a searchable, sortable database. The new library, which was released to the public today, will help guide the search for new organic photovoltaic materials.
The release of the database, announced by the White House’s Office of Science and Technology Policy, marks the second anniversary of the so-called Materials Genome Initiative, a federal effort to “double the pace of innovation, manufacture, and deployment of high-tech materials”—a process that normally can take years or decades. Agencies participating in the program, which aims to foster collaboration and data sharing among academic and private-sector materials science researchers, have awarded a total of $63 million to projects over the past year.
Crucial to the push is the use of huge amounts of computing power and machine learning to virtually test new materials and predict their properties. The idea is that these insights will make it easier and faster for engineers to find materials that behave a certain way. “It’s sort of like mapping out what you can do in principle—all the basic properties,” says Gerbrand Ceder, a professor of materials science and engineering at MIT. “And then people can do more targeted engineering.”
Ceder, an early promoter of the idea, was the first to use the term “materials genome”—as the name of his project to screen inorganic materials for properties that could lead to better energy technology, especially batteries. That project, now headed by both MIT and Lawrence Berkeley National Laboratory, was renamed the Materials Project when the White House wanted to launch the nationwide Materials Genome Initiative.
The large database released today represents the work of the so-called Harvard Clean Energy Project, headed by Alán Aspuru-Guzik, a chemistry professor at Harvard and one of MIT Technology Review’s Innovators Under 35 in 2010. A goal of the project is to use high-throughput computing to help locate materials that can be used for more efficient organic electronics. In 2011, the group used computers to identify a material that, once synthesized and tested by collaborators at Stanford University, was shown to have outstanding electronic properties (see “Speeding Up Materials Design”).
Identifying promising candidates for organic solar cells has been the focus of the project’s latest phase. The newly public screening library is organized according to properties attractive for solar cells, like the efficiency at which a material can convert the sun’s light into electricity. Organic materials generally do not do this very efficiently, but cells based on them could be lighter, more flexible, and potentially cheaper than those made from inorganic materials.
Currently only a few organic materials are known to be able to convert 10 percent or more of the energy in sunlight into electricity, and the world efficiency record for organic solar cells is just over 11 percent. In comparison, most silicon-based solar cells have efficiencies of 15 to 20 percent. The new library features 35,000 molecules that have the potential to be over 10 percent efficient, according to computer modeling. A thousand of them are predicted to be more than 11 percent efficient.
The project started with 26 molecular “fragments,” chosen because earlier experiments had shown they could be building blocks for molecules with desirable properties. Millions of combinations of those fragments were then tested, using a quantum chemistry model. It generally takes about 12 hours to screen one molecule, says Aspuru-Guzik. The computationally intensive calculations were performed with help from IBM’s World Community Grid, which allows volunteers to donate surplus computing power from their own machines to selected projects. This “human supercomputer,” as Aspuru-Guzik calls it, contributed over 17,000 years of computing time to the project, which has so far generated around 400 terabytes of data.
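The combinatorial explosion that turns 26 fragments into millions of candidate molecules can be sketched with a toy enumeration. This is purely illustrative: the fragment names are placeholders, and the real generation rules (attachment points, bonding patterns, chemical validity) are not modeled here, so the project’s actual candidate space is structured differently and far larger.

```python
from itertools import combinations_with_replacement

# Hypothetical labels standing in for the project's 26 molecular fragments
fragments = [f"frag_{i:02d}" for i in range(26)]

def candidate_combinations(fragments, max_size):
    """Yield unordered fragment combinations of 1..max_size pieces.

    Real molecule generation must also account for how fragments attach
    to one another, which multiplies the number of distinct candidates.
    """
    for size in range(1, max_size + 1):
        yield from combinations_with_replacement(fragments, size)

# Count combinations by size to see how quickly the space grows
counts = {}
for combo in candidate_combinations(fragments, 4):
    counts[len(combo)] = counts.get(len(combo), 0) + 1

print(counts)  # {1: 26, 2: 351, 3: 3276, 4: 23751}
```

Even this simplified count grows steeply with combination size; allowing longer chains and distinct attachment geometries is what pushes the search space into the millions, which is why each candidate’s roughly 12-hour quantum-chemistry screen had to be distributed across volunteer machines.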
Releasing this information to the public is very important, says Ceder. It’s important to have “a lot of eyes on the data” for quality control, and because creative people will figure out how to use and improve the data in surprising and useful ways. “You cannot anticipate what can be done with your data,” he says.