Predicting Coronary Heart Disease Using Data Mining and Machine Learning Solutions.

Journal: Anais da Academia Brasileira de Ciencias
Published Date:

Abstract

This research focuses on predicting cardiovascular disease (CVD) using machine learning classification strategies. The study integrates multiple machine learning techniques, leveraging the strengths of Random Forest and Gradient Boosting (GB). The authors developed an ensemble learning model that combines Linear Regression, Random Forest, and Gradient Boosting algorithms and is optimized using Bayesian hyperparameter tuning. The model performed strongly in predicting CVD outcomes, with reported classification accuracies of 95.5%, 94.26%, and 98.3% for the Linear Regression, Decision Tree, and Gradient Boosted methods, respectively. The true positive rate of the GB algorithm's patient predictions was 98.3%. The study hypothesizes that, on the Framingham dataset of 4240 samples, the GB method predicts outcomes better than the other algorithms.
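
To make the reported metrics concrete, the following is a minimal sketch of how an ensemble's predictions could be scored for accuracy and true positive rate. The abstract does not say how the individual models are combined, so the soft-voting step (averaging predicted probabilities) is an assumption, and the function names, the 0.5 threshold, and the toy probabilities are hypothetical and are not taken from the Framingham dataset or from the authors' code.

```python
from typing import List, Sequence, Tuple


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the positive-class probability of each sample across models.

    Assumption: soft voting stands in for whatever combination rule
    the authors actually used.
    """
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def accuracy_and_tpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, true positive rate) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # TPR (sensitivity) = TP / (TP + FN); defined as 0.0 if there are no positives.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, tpr


# Hypothetical toy data: 1 = disease present, 0 = disease absent.
y_true = [1, 0, 1, 1, 0, 0]

# Hypothetical positive-class probabilities from three base models.
model_a = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
model_b = [0.8, 0.3, 0.6, 0.7, 0.4, 0.2]
model_c = [0.7, 0.1, 0.3, 0.9, 0.3, 0.3]

avg_prob = soft_vote([model_a, model_b, model_c])

# Assumed decision threshold of 0.5 on the averaged probability.
y_pred = [1 if p >= 0.5 else 0 for p in avg_prob]

accuracy, tpr = accuracy_and_tpr(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, true positive rate={tpr:.3f}")
```

Accuracy counts all correct predictions, while the true positive rate counts only correctly identified patients among those who actually have the disease, which is why the two figures are reported separately for a screening task.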

Authors

  • Vijai M Moorthy
    Vignan's Foundation for Science Technology & Research, Department of Advanced Computer Science and Engineering, 522202 Guntur, Andhra Pradesh, India.
  • Bhupal N Dharamsoth
    Vignan's Foundation for Science Technology & Research, Department of Advanced Computer Science and Engineering, 522202 Guntur, Andhra Pradesh, India.
  • Vijayalakshmi Muthukaruppan
    Vignan's Foundation for Science Technology & Research, Department of Advanced Computer Science and Engineering, 522202 Guntur, Andhra Pradesh, India.
  • Arul Elango
    Vignan's Foundation for Science Technology & Research, Department of Advanced Computer Science and Engineering, 522202 Guntur, Andhra Pradesh, India.
  • Kalaiarasi Ganesan
    Vignan's Foundation for Science Technology & Research, Department of Advanced Computer Science and Engineering, 522202 Guntur, Andhra Pradesh, India.