Data is needed to help AI find new materials. Meta is giving away massive amounts for free
“We’re really firm believers that by contributing to the community and building upon open-source data models, the whole community moves further, faster,” says Larry Zitnick, the lead researcher for the OMat project.
Zitnick says the new OMat24 model will top the Matbench Discovery leaderboard, which ranks the best machine-learning models for materials science. Its data set will also be among the largest available.
“Materials science is having a machine-learning revolution,” says Shyue Ping Ong, a professor of nanoengineering at the University of California, San Diego, who was not involved in the project.
Previously, scientists were limited to doing very accurate calculations of material properties on very small systems or less accurate calculations on very big systems, says Ong. These processes were time-consuming and costly. He says that machine learning has closed the gap, and that AI models have allowed scientists to simulate combinations of elements from across the periodic table more quickly and inexpensively.
Meta's decision to openly share its data is more important than the AI model itself, according to Gabor Csanyi, a professor of molecular modeling at the University of Cambridge who was not involved with the project. Csanyi says this is in stark contrast with other major industry players, such as Google and Microsoft, which have also published models that look competitive but were trained on secret data sets.
To create OMat24, Meta sampled materials from an existing data set called Alexandria. It then ran different simulations and calculations to scale it up. Ong says that other data sets may not be of high quality.