Machine Learning Methods to Predict Density Functional Theory B3LYP Energies of HOMO and LUMO Orbitals.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

Machine learning algorithms were explored for the fast estimation of HOMO and LUMO orbital energies calculated by DFT B3LYP, using molecular descriptors based exclusively on connectivity. The project involved the retrieval and generation of molecular structures, quantum chemical calculations for a database of >111 000 structures, development of new molecular descriptors, and training/validation of machine learning models. Several machine learning algorithms were screened, and an applicability domain was defined based on Euclidean distances to the training set. Random forest models predicted an external test set of 9989 compounds with mean absolute errors (MAE) up to 0.15 and 0.16 eV for the HOMO and LUMO orbitals, respectively. The impact of the quantum chemical calculation protocol was assessed with a subset of compounds. Including the orbital energy calculated by PM7 as an additional descriptor significantly improved the quality of the estimations (reducing the MAE by >30%).
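
The two quantities named in the abstract, the MAE and a Euclidean-distance applicability domain, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the distances are aggregated or what cutoff was used, so the nearest-neighbor rule, the `THRESHOLD` value, the function names, and the toy descriptor vectors and energies below are all assumptions made for illustration.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def nearest_training_distance(x, training_set):
    """Euclidean distance from descriptor vector x to its closest training vector."""
    return min(math.dist(x, t) for t in training_set)


def in_applicability_domain(x, training_set, threshold):
    """Assumed rule: a query is inside the domain if its nearest training
    vector lies within `threshold` (the paper's exact criterion may differ)."""
    return nearest_training_distance(x, training_set) <= threshold


# Toy descriptor vectors (hypothetical, not taken from the paper).
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
query_near = (0.1, 0.1)
query_far = (5.0, 5.0)
THRESHOLD = 0.5  # hypothetical cutoff

# Toy orbital energies in eV (hypothetical reference and predicted values).
dft_energies = [-6.0, -5.5, -7.0]
predicted_energies = [-6.25, -5.5, -6.5]
mae = mean_absolute_error(dft_energies, predicted_energies)

print("MAE (eV):", mae)
print("near query in domain:", in_applicability_domain(query_near, train, THRESHOLD))
print("far query in domain:", in_applicability_domain(query_far, train, THRESHOLD))
```

In a sketch like this, predictions for queries that fall outside the domain would be flagged as less reliable rather than discarded; how the original work treated such compounds is not stated in the abstract.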

Authors

  • Florbela Pereira
    LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal.
  • Kaixia Xiao
    Henan Engineering Research Center of Industrial Circulating Water Treatment, College of Chemistry and Chemical Engineering, Henan University, Kaifeng, 475004, PR China.
  • Diogo A R S Latino
    LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal.
  • Chengcheng Wu
    Henan Engineering Research Center of Industrial Circulating Water Treatment, College of Chemistry and Chemical Engineering, Henan University, Kaifeng, 475004, PR China.
  • Qingyou Zhang
    Institute of Environmental and Analytical Sciences, College of Chemistry and Chemical Engineering, Henan University, Kaifeng, 475004, PR China.
  • João Aires-de-Sousa
    LAQV-REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa, 2829-516 Caparica, Portugal. Phone/fax: +351 21 2948300. Email: joao@airesdesousa.com.