Band Gap and Reorganization Energy Prediction of Conducting Polymers by the Integration of Machine Learning and Density Functional Theory.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

The performance and reliability of machine learning (ML)-based quantitative structure-property relationship (QSPR) models depend on the quality, size, and diversity of the data set used for model training. In this study, we manually curated a large-scale data set containing 3120 donor-acceptor (D-A) conjugated polymers (CPs) by selecting the 60 most utilized donors and 52 most utilized acceptors. This data set serves as a valuable resource for ML-based prediction of key electronic properties such as band gap energy (E_g) and hole reorganization energy (λ), both calculated using density functional theory (DFT), to advance organic photovoltaics (OPV). Beyond data set construction, we systematically investigated how different descriptor and fingerprint types affect the performance of the ML model. Recognizing that not all features contribute equally to model performance, we conducted an in-depth analysis to identify the most informative descriptors for these fundamental optoelectronic properties. Our findings show that kernel partial least-squares (KPLS) regression utilizing radial and molprint2D fingerprints achieved the highest accuracy in predicting E_g, with R² values of 0.899 and 0.897, respectively. For λ prediction, models integrating electronic descriptors such as frontier orbital energy levels significantly improved performance, achieving an R² value of 0.830. This study provides a comprehensive investigation of how different descriptors influence model performance in OPV research. By analyzing why certain models succeed while others fail, our findings offer insight into feature selection and data set optimization for accurate target property prediction in organic electronics. The developed ML models provide a predictive framework for high-performance OPV materials design, significantly reducing the reliance on labor-intensive experimental procedures and computationally expensive first-principles calculations.
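
The data set size quoted in the abstract is consistent with pairing every selected donor with every selected acceptor, since 60 × 52 = 3120. The sketch below illustrates that enumeration under two assumptions: each D-A pair appears exactly once, and the labels D1..D60 and A1..A52 are hypothetical placeholders, because the abstract does not list the actual monomers.

```python
from itertools import product

# Hypothetical placeholder labels; the abstract does not name the monomers.
donors = [f"D{i}" for i in range(1, 61)]      # 60 donor units
acceptors = [f"A{j}" for j in range(1, 53)]   # 52 acceptor units

# Assumption: every donor is combined with every acceptor exactly once.
pairs = list(product(donors, acceptors))

print(len(pairs))  # 3120
```

In a real workflow each pair would map to a polymer structure (for example, a repeat-unit representation) before descriptors or fingerprints are computed; that step is not shown here.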

Authors

  • Tugba Haciefendioglu
    Department of Chemistry, Middle East Technical University, 06800 Ankara, Turkey.
  • Erol Yildirim
    Department of Chemistry, Middle East Technical University, 06800 Ankara, Turkey.