Predicting coronary heart disease with advanced machine learning classifiers for improved cardiovascular risk assessment.
Journal: Scientific Reports
PMID: 40247042
Abstract
Coronary heart disease (CHD) is a leading cause of mortality worldwide, and its early prediction remains a critical challenge in clinical data analysis. Machine learning (ML) offers valuable diagnostic support by leveraging healthcare data to enhance decision-making and prediction accuracy. Although numerous studies have applied ML classifiers to heart disease prediction, they often do not state clearly how they address the key challenges involved. In this paper, we present a comprehensive ML framework that systematically tackles these issues. First, we employ mutual information (MI) for feature selection to isolate the most informative predictors. Second, we address the significant class imbalance in the dataset using the Synthetic Minority Oversampling Technique (SMOTE), which substantially improves model training. Third, we propose a novel hybrid model that integrates particle swarm optimization (PSO) with an artificial neural network (ANN), using PSO to optimize the network's feature weights and biases. Additionally, we conduct a comparative analysis with traditional classifiers, including Logistic Regression and Random Forest, using the National Health and Nutrition Examination Survey (NHANES) dataset. Our results demonstrate that while conventional classifiers achieve an accuracy of 95.8%, the proposed PSO-ANN model attains an accuracy of up to 97% in predicting CHD. The contributions of this work are therefore threefold: improved feature selection, explicit handling of data imbalance, and a hybrid model with better prediction performance.
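The three steps named in the abstract (MI-based feature scoring, SMOTE oversampling, and PSO-trained ANN weights) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the abstract does not give the network architecture, the PSO coefficients, or the MI estimator, so the one-hidden-layer network, the inertia and acceleration values (0.7 and 1.5), the weight bounds, the discrete-feature MI formula, and all function names below are assumptions chosen for illustration. The toy data in the demo is invented and unrelated to NHANES.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """MI (in nats) between a discrete feature xs and discrete labels ys."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def smote(minority, n_new, k=3, rng=random):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (needs >= 2 points)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(p, base))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def ann_predict(w, x, hidden):
    """One-hidden-layer network (assumed architecture); w is a flat list:
    hidden * (d weights + 1 bias), then hidden output weights, then 1 bias."""
    d = len(x)
    out = w[-1]  # output bias
    for h in range(hidden):
        start = h * (d + 1)
        z = w[start + d] + sum(w[start + i] * x[i] for i in range(d))
        out += w[hidden * (d + 1) + h] * math.tanh(z)
    return 1.0 / (1.0 + math.exp(-out))

def loss(w, X, y, hidden):
    """Mean squared error of the network's predicted probabilities."""
    return sum((ann_predict(w, x, hidden) - t) ** 2
               for x, t in zip(X, y)) / len(X)

def pso_train(X, y, hidden=3, particles=20, iters=60, rng=random):
    """Search the ANN's weights and biases with a basic global-best PSO."""
    dim = hidden * (len(X[0]) + 1) + hidden + 1
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best = [p[:] for p in pos]
    best_loss = [loss(p, X, y, hidden) for p in pos]
    g = min(range(particles), key=best_loss.__getitem__)
    g_pos, g_loss = best[g][:], best_loss[g]
    for _ in range(iters):
        for i in range(particles):
            for j in range(dim):
                # Inertia 0.7 and acceleration 1.5 are assumed, common defaults.
                vel[i][j] = (0.7 * vel[i][j]
                             + 1.5 * rng.random() * (best[i][j] - pos[i][j])
                             + 1.5 * rng.random() * (g_pos[j] - pos[i][j]))
                # Clamp weights to [-5, 5] to keep the sigmoid well-behaved.
                pos[i][j] = max(-5.0, min(5.0, pos[i][j] + vel[i][j]))
            current = loss(pos[i], X, y, hidden)
            if current < best_loss[i]:
                best[i], best_loss[i] = pos[i][:], current
                if current < g_loss:
                    g_pos, g_loss = pos[i][:], current
    return g_pos, g_loss

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy, invented data for illustration only (not NHANES).
    majority = [[rng.gauss(-1, 0.5), rng.gauss(0, 1)] for _ in range(20)]
    minority = [[rng.gauss(1, 0.5), rng.gauss(0, 1)] for _ in range(5)]
    minority += smote(minority, 15, rng=rng)  # balance the classes
    X, y = majority + minority, [0] * 20 + [1] * 20
    print("MI of sign(x0) with label:",
          mutual_information([x[0] > 0 for x in X], y))
    weights, final_loss = pso_train(X, y, rng=rng)
    print("training MSE:", final_loss)
```

In a real evaluation, oversampling would normally be applied only to the training split so that synthetic points do not leak into the test set, and continuous features would need binning or a continuous MI estimator before scoring.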