Ensemble learning with explainable AI for improved heart disease prediction based on multiple datasets.
Journal: Scientific reports
PMID: 40263348
Abstract
Heart disease is one of the leading causes of death worldwide. Early prediction and detection are crucial because they allow medical professionals to intervene at earlier stages, and machine learning can help healthcare professionals diagnose cardiac conditions more accurately. This study aimed to enhance heart disease prediction using stacking and voting ensemble methods. Fifteen base models were trained on two different heart disease datasets. After evaluating various combinations, six base models were combined into ensemble models that employ either a meta-model (stacking) or a majority vote (voting). The performance of the stacking and voting models was compared with that of the individual base models. To ensure the robustness of the performance evaluation, we conducted a statistical analysis using the Friedman aligned ranks test and Holm post-hoc pairwise comparisons. The results indicated that the developed ensemble models, particularly stacking, consistently outperformed the other models, achieving higher accuracy and improved predictive outcomes. This statistical validation supports the reliability of the proposed methods. Furthermore, we incorporated explainable AI (XAI) through SHAP analysis to interpret the model predictions, providing transparency and insight into how individual features influence heart disease prediction. These findings suggest that combining the predictions of multiple models through stacking or voting may enhance the performance of heart disease prediction and serve as a valuable tool in clinical decision-making.
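
The abstract describes two ways of combining six base models: stacking, where a meta-model learns from the base models' outputs, and voting, where the majority class wins. The sketch below illustrates both with scikit-learn. It is not the authors' pipeline: the abstract does not name the six base models, the meta-model, or the preprocessing, so the estimators chosen here, the logistic-regression meta-model, and the synthetic data from `make_classification` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a tabular heart disease dataset (assumption:
# the real datasets are not reproduced here).
X, y = make_classification(
    n_samples=600, n_features=13, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def make_base_models():
    """Return six fresh base models (a hypothetical selection)."""
    return [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("nb", GaussianNB()),
        # Distance- and margin-based models get feature scaling.
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
        ("svc", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ]


# Stacking: a meta-model is trained on cross-validated base-model outputs.
stacking = StackingClassifier(
    estimators=make_base_models(),
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-model
    cv=5,
)

# Voting: each base model casts one vote and the majority class wins.
voting = VotingClassifier(estimators=make_base_models(), voting="hard")

# Held-out accuracy for each base model and for both ensembles.
scores = {}
for name, model in make_base_models() + [("stacking", stacking), ("voting", voting)]:
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name:>8}: accuracy = {acc:.3f}")
```

On synthetic data the ranking of models carries no clinical meaning; the point is only the structure of the comparison. With an even number of voters, hard voting can tie, and scikit-learn resolves ties in favour of the class label that sorts first.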