Machine learning-based activity prediction of phenoxy-imine catalysts and its structure-activity relationship study.
Journal: Molecular diversity
Published Date: Mar 7, 2025
Abstract
This study systematically investigates the structure-activity relationships of 30 Ti-phenoxy-imine (FI-Ti) catalysts using machine learning (ML) approaches. Among the tested algorithms, XGBoost showed the best predictive performance, achieving R² values of 0.998 (training set) and 0.859 (test set), with a cross-validated Q² of 0.617. Feature importance analysis identified three composite descriptors (ODI_HOMO_1_Neg_Average GGI2, ALIEmax GATS8d, and Mol_Size_L) as critical contributors, collectively accounting for more than 63% of the model's predictive power. Polynomial feature expansion effectively captured nonlinear interactions between descriptors, while SHAP and ICE analyses enhanced interpretability, revealing threshold effects and descriptor-specific trends. However, the model's generalizability may be constrained by the limited dataset size (30 samples) and by its reliance on density functional theory (DFT)-derived descriptors, so experimental validation is needed. In addition, the study considered only ethylene polymerization at 40 °C; applicability to other catalytic systems or reaction conditions remains to be verified. These findings provide a data-driven framework for catalyst design, though future work should integrate experimental validation and expand the dataset to improve predictive robustness.
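As a rough illustration of two ideas named in the abstract, the sketch below shows how a degree-2 polynomial expansion turns base descriptors into composite ones, and how a coefficient of determination is computed. It is not the authors' pipeline. The helper names `expand_degree2` and `r_squared` and all numeric values are hypothetical. The space-separated product names (for example "ALIEmax GATS8d") follow the naming convention used by scikit-learn's PolynomialFeatures; that the study used this tool is an assumption, since the abstract does not name it. A cross-validated Q² is commonly obtained with the same formula applied to predictions made for held-out samples.

```python
from itertools import combinations_with_replacement


def expand_degree2(names, row):
    """Return the original terms plus all degree-2 terms for one sample.

    Products are labelled "a b" and squares "a^2" (an assumed naming
    convention, chosen to match scikit-learn's PolynomialFeatures).
    """
    out_names = list(names)
    out_values = list(row)
    for i, j in combinations_with_replacement(range(len(names)), 2):
        label = f"{names[i]}^2" if i == j else f"{names[i]} {names[j]}"
        out_names.append(label)
        out_values.append(row[i] * row[j])
    return out_names, out_values


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical descriptor values for a single catalyst (not from the paper).
names = ["ALIEmax", "GATS8d", "Mol_Size_L"]
row = [2.0, 0.5, 10.0]

expanded_names, expanded_values = expand_degree2(names, row)
for name, value in zip(expanded_names, expanded_values):
    print(f"{name}: {value}")
```

With three base descriptors the expansion yields nine terms: the three originals, three squares, and three pairwise products. A tree-based model such as XGBoost can then be trained on the expanded columns, and its feature importances reported per composite term.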