Benchmarking data efficiency in Δ-ML and multifidelity models for quantum chemistry.
Journal:
The Journal of Chemical Physics
Published Date:
Jul 14, 2025
Abstract
The development of machine learning (ML) methods has made quantum chemistry (QC) calculations more accessible by reducing the computational cost incurred in conventional QC methods. That cost has in turn shifted to the overhead of generating training data. Growing efforts to reduce this overhead led to the development of Δ-ML and multifidelity machine learning methods, which use data at more than one QC level of accuracy, or fidelity. This work compares the data costs associated with Δ-ML, multifidelity machine learning (MFML), and optimized MFML against a newly introduced Multifidelity Δ-Machine Learning (MFΔML) method for the prediction of ground state energies, vertical excitation energies, and the magnitude of the electronic contribution of molecular dipole moments from the multifidelity benchmark dataset QeMFi. The assessment is based on the training data generation cost associated with each model and is compared with the single-fidelity kernel ridge regression case. The results indicate that multifidelity methods surpass the standard Δ-ML approach when a large number of predictions is required. In applications where only a few predictions/evaluations are to be made using ML models, the MFΔML method developed here is shown to provide an added advantage over conventional Δ-ML.
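To make the Δ-ML idea concrete, the following is a minimal sketch in Python, not the implementation used in the article. It assumes a Gaussian-kernel kernel ridge regression written directly in NumPy, and it uses two synthetic one-dimensional functions (`low_fidelity` and `high_fidelity`, both hypothetical stand-ins) in place of real QC data such as QeMFi. The kernel width `sigma` and regularizer `reg` are arbitrary illustrative choices.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def krr_fit(X, y, sigma=1.0, reg=1e-6):
    """Solve (K + reg * I) alpha = y for the KRR coefficients."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_query, sigma=1.0):
    """Evaluate the fitted KRR model at the query points."""
    return gaussian_kernel(X_query, X_train, sigma) @ alpha


# Hypothetical stand-ins for a cheap baseline and an expensive target method.
def low_fidelity(X):
    return np.sin(3.0 * X[:, 0])


def high_fidelity(X):
    return np.sin(3.0 * X[:, 0]) + 0.3 * X[:, 0] ** 2


rng = np.random.default_rng(0)
X_train = rng.uniform(-2.0, 2.0, size=(40, 1))   # synthetic "geometries"
X_test = np.linspace(-1.5, 1.5, 25)[:, None]

# Delta-ML: learn the difference between target and baseline fidelity.
delta_train = high_fidelity(X_train) - low_fidelity(X_train)
alpha_delta = krr_fit(X_train, delta_train)

# Prediction = baseline value at the query point + learned correction.
pred_delta_ml = low_fidelity(X_test) + krr_predict(X_train, alpha_delta, X_test)

# Single-fidelity reference: learn the target property directly.
alpha_direct = krr_fit(X_train, high_fidelity(X_train))
pred_direct = krr_predict(X_train, alpha_direct, X_test)

mae_delta_ml = np.mean(np.abs(pred_delta_ml - high_fidelity(X_test)))
mae_direct = np.mean(np.abs(pred_direct - high_fidelity(X_test)))
print(f"MAE Delta-ML: {mae_delta_ml:.2e}, MAE single fidelity: {mae_direct:.2e}")
```

In this sketch every Δ-ML prediction calls `low_fidelity` at the query point, so the baseline calculation is paid again for each evaluation. That per-prediction cost is one plausible reading of why the number of predictions matters in the comparison described above.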