An efficient catalyst screening strategy combining machine learning and causal inference.
Journal:
Journal of environmental management
PMID:
39999759
Abstract
Because catalyst synthesis methods are so diverse, optimizing catalysts by traditional experimental methods has become increasingly challenging. This study presents a new strategy for determining catalyst performance that incorporates causal inference results as prior knowledge into machine learning models. The strategy was used to explore the correlation between the ratio of nitrogen functional groups in catalysts and degradation performance, with the aim of addressing the low efficiency of catalyst screening. A dataset comprising 14 critical parameters, including the physicochemical properties of the catalysts and the reaction conditions, was established through the analysis of 182 experimental results. Analysis of these real data showed that the CatBoost model performed best (R = 0.953, MAE = 3.277, RMSE = 5.615). SHAP analysis showed that pyridinic N was the key N-functional group affecting the degradation of bisphenol A (BPA). DoWhy causal inference further verified the positive effect of pyridinic N, with a causal effect estimate of 0.4388. The strategy narrows the range of candidate catalysts through causal inference pre-screening and then uses the CatBoost model to accurately evaluate the performance of the remaining candidates. This can reduce catalyst screening from multiple processes to a single process and significantly improve the efficiency of catalyst selection.
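The three scores reported for the CatBoost model can be illustrated with a minimal sketch. It assumes that R denotes the Pearson correlation coefficient between observed and predicted degradation values, which the abstract does not state. The function names and the four data points are hypothetical and are not taken from the study's dataset.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R here)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical degradation efficiencies (%), for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 83.0]

print("MAE :", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("R   :", pearson_r(observed, predicted))
```

MAE and RMSE are expressed in the units of the predicted quantity, and RMSE weights large errors more heavily, which is consistent with the reported RMSE exceeding the reported MAE.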