The Open Materials 2024 (OMat24) inorganic materials dataset and models.
Journal:
Nature Computational Science
Published Date:
Jun 2, 2026
Abstract
The discovery and simulation of inorganic materials is core to diverse applications from climate change to semiconductor manufacturing. Artificial intelligence has the potential to dramatically accelerate materials simulation, discovery and design. Although considerable progress has been made in developing training datasets and machine learning interatomic potential architectures, the state of the art in openly available and reproducible datasets and models has lagged behind proprietary models. Here, to address this issue, we present the Open Materials 2024 (OMat24) dataset, comprising over 110 million density functional theory calculations across diverse chemistries, materials and configurations. Machine learning interatomic potential models trained on OMat24 achieve leading performance on the Matbench-Discovery leaderboard, surpassing previous models with F1 scores above 0.9 for stability and approximately 20 meV per atom accuracy for formation energy. Models trained on OMat24 also exhibit high accuracy in newly developed thermal conductivity and phonon prediction task benchmarks. We show that OMat24's diversity corrects the consistent softening bias of prior models trained on less diverse datasets, which systematically underpredicted energy, forces and derivative properties such as phonons. The OMat24 dataset has enabled the research community to develop improved model architectures, resulting in a step-change improvement in inorganic material property prediction accuracy.