Predicting Oxidation Potentials with DFT-Driven Machine Learning.
Journal:
Journal of Chemical Information and Modeling
Published Date:
May 28, 2025
Abstract
We introduce OxPot, a comprehensive open-access data set of more than 15,000 chemically diverse organic molecules. Leveraging the precision of DFT-derived highest occupied molecular orbital energies (E_HOMO), OxPot serves as a robust platform for accelerating the prediction of oxidation potentials (E_ox). Using the PBE0 hybrid functional and the cc-pVDZ basis set, we establish a strong near-linear correlation between E_HOMO and experimental E_ox values, with a correlation coefficient of 0.977 and a low root-mean-square error (RMSE) of 0.064. This correlation highlights the accuracy of OxPot as a machine learning (ML)-ready resource for E_ox prediction. To facilitate future development of ML models, we extensively tested various algorithms and conducted a thorough feature importance analysis. This analysis offers insight into the key molecular descriptors that influence E_ox predictions, thereby enhancing model interpretability and guiding the design of more effective predictive models. Furthermore, the computational efficiency of the methodology enables rapid E_ox predictions for additional chemically similar molecules, increasing its applicability to large-scale molecular screening and broader applications in chemical research.
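The near-linear relationship described in the abstract can be illustrated with an ordinary least-squares fit of oxidation potential against HOMO energy, followed by the two reported quality metrics (coefficient of determination and RMSE). The sketch below is only an illustration under stated assumptions: the function names `fit_linear_calibration` and `predict_e_ox` are hypothetical, the data are synthetic placeholders generated in the script (not OxPot values), and the slope, intercept, and noise level used to generate them are arbitrary choices rather than results from the paper.

```python
import numpy as np


def fit_linear_calibration(e_homo, e_ox):
    """Least-squares fit e_ox ~ slope * e_homo + intercept.

    Returns (slope, intercept, r2, rmse). Hypothetical helper, not the
    authors' code.
    """
    e_homo = np.asarray(e_homo, dtype=float)
    e_ox = np.asarray(e_ox, dtype=float)

    # Degree-1 polynomial fit returns [slope, intercept].
    slope, intercept = np.polyfit(e_homo, e_ox, deg=1)

    predicted = slope * e_homo + intercept
    residuals = e_ox - predicted

    # Root-mean-square error of the fit.
    rmse = float(np.sqrt(np.mean(residuals ** 2)))

    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((e_ox - e_ox.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    return float(slope), float(intercept), r2, rmse


def predict_e_ox(e_homo, slope, intercept):
    """Apply a fitted linear calibration to new HOMO energies."""
    return slope * np.asarray(e_homo, dtype=float) + intercept


# Synthetic placeholder data (assumption): an arbitrary linear trend plus
# Gaussian noise. These numbers are NOT taken from OxPot or the paper.
rng = np.random.default_rng(0)
e_homo_demo = np.linspace(-7.5, -5.0, 20)
e_ox_demo = -0.8 * e_homo_demo - 3.5 + rng.normal(0.0, 0.05, size=20)

slope, intercept, r2, rmse = fit_linear_calibration(e_homo_demo, e_ox_demo)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f} RMSE={rmse:.3f}")

# Predict for new (also hypothetical) HOMO energies.
new_predictions = predict_e_ox([-6.8, -5.9], slope, intercept)
print(new_predictions)
```

Once such a calibration is fitted, predicting for a new molecule needs only its computed HOMO energy, which is the sense in which a linear map of this kind supports fast screening of chemically similar compounds.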