Parsimonious and explainable machine learning for predicting mortality in patients post hip fracture surgery.
Journal:
Scientific Reports
Published Date:
Jul 2, 2025
Abstract
Hip fractures in the elderly continue to carry significant risk and high mortality despite advances in surgical procedures. In this study, we developed machine learning (ML) algorithms to estimate 30-day mortality risk after hip fracture surgery in the elderly using data from the National Surgical Quality Improvement Program (NSQIP 2012-2017, n = 62,492 patients). Our approach involves two models: one estimating a patient's 30-day mortality risk from pre-operative conditions alone, and another considering both pre-operative and post-operative factors. We performed comprehensive data cleaning and preprocessing, then applied tenfold cross-validation with randomized search on the training set to identify optimal hyperparameters for several ML models: logistic regression, Naive Bayes, random forest, AdaBoost, XGBoost, CatBoost, Gradient Boosting, and LightGBM. Model performance was evaluated on the test set using the Area Under the Receiver Operating Characteristic Curve (AUC). The best pre-operative model was AdaBoost, achieving an AUC of 0.792 with 29 features (predictors), and the best post-operative model was CatBoost, achieving an AUC of 0.885 with 45 features. After modeling, we derived feature importances for each of the two models and reduced the number of features to obtain a parsimonious, high-performing model. The pre-operative model achieves an AUC of 0.725 with the eight most important features, and the post-operative model achieves an AUC of 0.8529 with the six most important features. To check that the models' decision-making is compatible with clinical decisions and common practice, we applied explainability techniques such as SHAP to reveal the patterns learned by the models. These patterns were found to be clinically plausible.
In summary, our approach, which combines data preprocessing, model tuning, feature selection, and explainability, achieved state-of-the-art performance in predicting 30-day mortality following hip fracture surgery using a limited set of features, making it highly applicable in clinical settings.
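
As a reading aid for the AUC values reported above, the following is a minimal sketch of how an AUC can be computed from predicted risks and observed outcomes. It is not the study's code: the function name `roc_auc` and the toy labels and scores are hypothetical, and the pairwise formulation is one standard way to define the metric (the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half).

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC: the fraction of positive/negative pairs
    in which the positive case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from NSQIP): 1 = died within 30 days, 0 = survived.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.05, 0.10, 0.35, 0.40, 0.60, 0.20, 0.15, 0.80]

# 14 of the 16 positive/negative pairs are ranked correctly, so AUC = 0.875.
print(roc_auc(toy_labels, toy_scores))
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The quadratic loop is chosen for clarity; a rank-based implementation would be preferable for a cohort of tens of thousands of patients.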