Machine Learning-Based Model for Postoperative Stroke Prediction in Coronary Artery Disease
Journal:
arXiv
Published Date:
Mar 15, 2025
Abstract
Coronary artery disease remains one of the leading causes of mortality
globally. Despite advances in revascularization treatments such as percutaneous
coronary intervention (PCI) and coronary artery bypass grafting (CABG),
postoperative stroke remains a serious complication. This study aims to develop and evaluate a
sophisticated machine learning prediction model to assess postoperative stroke
risk in coronary revascularization patients. Data were drawn from the MIMIC-IV
database, yielding a cohort of 7,023 individuals. Study data
included clinical, laboratory, and comorbidity variables. Variables with more
than 30% missing values were removed, as were, to reduce multicollinearity,
features with a correlation coefficient above 0.9. The dataset was split into
70% training and 30% test sets. Remaining missing values were imputed with a
Random Forest method. Numerical variables were normalized, and categorical
variables were one-hot encoded. LASSO regularization was used for feature
selection, and grid search for hyperparameter tuning. Finally, Logistic Regression, XGBoost,
SVM, and CatBoost were employed for predictive modeling, and SHAP analysis
quantified each variable's contribution to predicted stroke risk. With an AUC
of 0.855 (0.829-0.878), the SVM model outperformed the logistic regression and
CatBoost models reported in prior research. SHAP analysis showed that the
Charlson Comorbidity Index (CCI),
diabetes, chronic kidney disease, and heart failure are significant prognostic
factors for postoperative stroke. These results suggest that the refined
machine learning workflow reduces overfitting and improves predictive accuracy.
Models using the CCI alone cannot predict postoperative stroke risk as accurately
as those using individual comorbidity variables. The proposed approach provides
a more thorough and individualized risk assessment by encompassing a wider
range of clinically relevant characteristics, making it a better reference for
preoperative risk assessments and targeted intervention.
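
The workflow described in the abstract (missingness and correlation filtering, a 70/30 split, Random Forest-based imputation, scaling and one-hot encoding, LASSO feature selection, and a grid-searched SVM) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. It runs on synthetic data rather than MIMIC-IV, and all column names and coefficients are hypothetical. Several choices are assumptions: `IterativeImputer` with a `RandomForestRegressor` stands in for the Random Forest imputation, `StandardScaler` stands in for "normalized", an L1-penalized logistic regression stands in for LASSO selection, and the grid of `C` values is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC


def make_synthetic_cohort(n=400, seed=0):
    """Hypothetical stand-in for the real cohort; every column is invented."""
    rng = np.random.default_rng(seed)
    age = rng.normal(68, 10, n)
    creatinine = rng.normal(1.1, 0.4, n)
    diabetes = rng.integers(0, 2, n)
    df = pd.DataFrame({
        "age": age,
        "creatinine": creatinine,
        "creatinine_dup": creatinine * 88.4,  # same quantity in other units, correlation ~1
        "rare_lab": np.where(rng.random(n) < 0.6, np.nan, rng.normal(0, 1, n)),  # ~60% missing
        "diabetes": diabetes,
        "procedure": rng.choice(["PCI", "CABG"], n),
    })
    df.loc[rng.random(n) < 0.1, "creatinine"] = np.nan  # ~10% missing, to be imputed
    logit = 0.05 * (age - 68) + 1.0 * diabetes + 1.5 * (creatinine - 1.1) - 1.0
    y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)
    return df, y


def drop_sparse_and_collinear(df, max_missing=0.30, max_corr=0.90):
    """Drop columns above the missingness cutoff, then one of each highly correlated pair."""
    df = df.loc[:, df.isna().mean() <= max_missing]
    corr = df.select_dtypes("number").corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > max_corr).any()]
    return df.drop(columns=to_drop)


def build_search(numeric_cols, categorical_cols):
    """Preprocess, select features with an L1 model, then grid-search an SVM."""
    numeric = Pipeline([
        ("impute", IterativeImputer(
            estimator=RandomForestRegressor(n_estimators=10, random_state=0),
            max_iter=3, random_state=0)),
        ("scale", StandardScaler()),
    ])
    pre = ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    model = Pipeline([
        ("pre", pre),
        ("select", SelectFromModel(lasso)),
        ("svm", SVC(kernel="rbf", random_state=0)),
    ])
    grid = {"svm__C": [0.1, 1.0, 10.0]}  # arbitrary illustrative grid
    return GridSearchCV(model, grid, scoring="roc_auc", cv=3)


df, y = make_synthetic_cohort()
X = drop_sparse_and_collinear(df)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

numeric_cols = X.select_dtypes("number").columns.tolist()
categorical_cols = [c for c in X.columns if c not in numeric_cols]

search = build_search(numeric_cols, categorical_cols).fit(X_train, y_train)
test_auc = roc_auc_score(y_test, search.decision_function(X_test))
print(f"best params: {search.best_params_}, test AUC: {test_auc:.3f}")
```

Placing imputation, scaling, and feature selection inside the pipeline means each cross-validation fold refits them on its own training portion, which is one common way to limit leakage during the grid search. The AUC printed here reflects only the synthetic data and has no bearing on the reported 0.855.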