Hybrid CNN-GRU-XGBoost framework for optimized coronary artery disease diagnosis and risk stratification.
Journal: Computational biology and chemistry
Published Date: Oct 17, 2025
Abstract
Coronary artery disease (CAD) remains a leading driver of cardiovascular mortality, requiring diagnostic systems that deliver high discrimination, stability under class imbalance, and reproducible deployment. This work presents a hybrid pipeline that integrates convolutional encoders for localized feature interactions, gated recurrent units (GRUs) for conditional dependency modeling over ordered clinical attributes, and an extreme gradient-boosted tree classifier (XTree) for nonlinear decision refinement and feature-level attribution. The data pathway applies strict train-split isolation with source-aware quantile imputation, proximal denoising, robust trimmed standardization, stratified partitioning that preserves class and source distributions, and manifold-conformal minority augmentation to improve boundary coverage without leakage. Evaluation on the UCI Heart Disease cohort (Cleveland, Long Beach V, Switzerland, Hungary; p=14 attributes) used an 80/20 holdout and standard metrics. The proposed CNN-GRU-XTree attained 96.03% accuracy, 94.70% precision, 97.66% sensitivity, 96.17% F1, and 94.35% specificity. Relative to the strongest non-proposed baseline (CNN-LSTM-XTree: 95.63% accuracy, 94.70% precision, 96.90% sensitivity, 95.80% F1, 94.47% specificity), gains reached +0.40 percentage points (pp) in accuracy, +0.76 pp in sensitivity, and +0.37 pp in F1, with parity in precision and a negligible specificity delta (-0.12 pp). Against CNN-GRU-RF (95.24%/94.70%/96.15%/95.42%/94.18%), improvements were +0.79 pp accuracy, +1.51 pp sensitivity, +0.75 pp F1, and +0.17 pp specificity. Case-based simulations (600 min each) probed model behavior under clinically distinct conditions. In severe, unequivocal CAD, sensitivity remained ≥97% with specificity >93%; in low-risk asymptomatic profiles, specificity remained ≥95% with precision ≥93%; in borderline phenotypes with overlapping risk markers, F1 exceeded 94% while maintaining balanced error profiles.
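The percentage-point comparisons quoted in the abstract can be re-derived from the reported figures alone. The following is a minimal sketch, not the authors' evaluation code: the dictionary `REPORTED` and the helpers `pp_delta` and `f1_from` are hypothetical names introduced here for illustration, and the only inputs are the rounded values printed in the abstract. It recomputes each delta and checks that F1, taken as the harmonic mean of precision and sensitivity, matches the reported value to within rounding.

```python
# Metrics in percent, copied from the abstract (rounded values as published).
# REPORTED, pp_delta and f1_from are illustrative names, not from the paper.
REPORTED = {
    "CNN-GRU-XTree": {"accuracy": 96.03, "precision": 94.70,
                      "sensitivity": 97.66, "f1": 96.17, "specificity": 94.35},
    "CNN-LSTM-XTree": {"accuracy": 95.63, "precision": 94.70,
                       "sensitivity": 96.90, "f1": 95.80, "specificity": 94.47},
    "CNN-GRU-RF": {"accuracy": 95.24, "precision": 94.70,
                   "sensitivity": 96.15, "f1": 95.42, "specificity": 94.18},
}


def pp_delta(model, baseline):
    """Per-metric difference (model minus baseline) in percentage points."""
    return {
        metric: round(REPORTED[model][metric] - REPORTED[baseline][metric], 2)
        for metric in REPORTED[model]
    }


def f1_from(precision, sensitivity):
    """F1 as the harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Print the deltas of the proposed model against each baseline.
for baseline in ("CNN-LSTM-XTree", "CNN-GRU-RF"):
    print(baseline, pp_delta("CNN-GRU-XTree", baseline))

# Recompute F1 for the proposed model from its rounded precision/sensitivity.
proposed = REPORTED["CNN-GRU-XTree"]
print(round(f1_from(proposed["precision"], proposed["sensitivity"]), 2))
```

Because the inputs are already rounded to two decimals, the recomputed F1 can differ from the published figure in the last digit; a difference of that size reflects rounding rather than an inconsistency in the reported results.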