Predicting Stokes Shifts of Fluorescent Chromophores Using Ensemble Learning and H2OAutoML.
Journal:
Journal of Fluorescence
Published Date:
Aug 2, 2025
Abstract
This manuscript presents a comparative study of machine learning (ML) models for predicting the Stokes shifts of fluorescent chromophores. The performance of various ML models, including H2OAutoML, is evaluated. The Histogram-based Gradient Boosting regressor is a top performer, achieving an R-squared (R²) value of 0.71, while H2OAutoML achieves a higher R² of 0.83. The feature analysis reveals that the Burden-CAS-University of Texas 2D descriptor with low logP eigenvalues (BCUT2D_LOGPLOW) correlates strongly with the Stokes shift, likely because it captures molecular properties related to polarizability and charge distribution. The Shapley analysis identifies NumAliphaticRings as the descriptor with the greatest impact on model predictions, followed by the VSA_EState2 and fr_bicyclic descriptors. K-fold cross-validation demonstrates consistent performance across the data folds. Additionally, t-SNE mapping of the dataset reveals a compact cluster spanning -50 to 50 along both dimensions. These findings highlight the potential of ML regressors to predict the Stokes shift, an important property of fluorescent materials, and provide insight into the molecular descriptors that drive it.
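The two evaluation ideas named in the abstract, the R-squared score and K-fold cross-validation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the paper's dataset), the single "descriptor" is hypothetical, and a one-variable least-squares line stands in for the regressors the study actually compares. The helper names (`r_squared`, `k_fold_indices`, `fit_line`, `cross_validated_r2`) are likewise invented here.

```python
import random
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b with a single feature."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def cross_validated_r2(x, y, k=5, seed=0):
    """Return one held-out R^2 score per fold."""
    scores = []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        preds = [a * x[i] + b for i in held_out]
        scores.append(r_squared([y[i] for i in held_out], preds))
    return scores


# Synthetic stand-in data (NOT the paper's dataset): one hypothetical
# descriptor and a noisy linear "Stokes shift" response.
rng = random.Random(42)
descriptor = [rng.uniform(-3.0, 3.0) for _ in range(200)]
stokes_shift = [40.0 * d + 100.0 + rng.gauss(0.0, 20.0) for d in descriptor]

fold_scores = cross_validated_r2(descriptor, stokes_shift, k=5)
print([round(s, 3) for s in fold_scores], round(mean(fold_scores), 3))
```

Per-fold scores that stay close to one another are what "consistent performance across the data folds" would look like in this setting; a large spread between folds would instead suggest that the score depends on which samples happen to be held out.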