Key factors in predictive analysis of cardiovascular risks in public health.

Journal: Scientific Reports
Published Date:

Abstract

This research emphasizes the role of predictive analytics in evaluating the risk of cardiovascular disease (CVD), focusing on thorough data preparation and feature engineering for accurate predictions. We studied machine learning (ML) and deep learning (DL) models, including Logistic Regression (LR), Random Forest (RF), Gradient Boosting Machines (GBM), and Multilayer Perceptron (MLP). Each model's performance was assessed using accuracy, precision, recall, F1 score, and ROC AUC to determine its reliability and practical relevance. Our analysis shows the strengths of each model category. Conventional ML models such as Random Forest and Gradient Boosting Machines were effective in identifying patients at risk, achieving up to 74% accuracy and 72% recall. On the other hand, deep learning models such as the Multilayer Perceptron excelled in handling data, with a ROC AUC of approximately 80%. Despite the need for resources and extensive data preprocessing, these models are well suited to pinpointing the risk factors that are crucial for long-term CVD management.
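
For readers less familiar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and ROC AUC can be computed for a binary risk classifier. It is not the authors' code: the function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made purely for illustration, and none of the numbers come from the study.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Compute common binary-classification metrics from labels and scores.

    y_true:  list of 0/1 ground-truth labels (1 = at risk, by assumption).
    y_score: list of predicted probabilities for the positive class.
    threshold: hypothetical cut-off used to turn scores into 0/1 predictions.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # ROC AUC as the probability that a randomly chosen positive case is
    # scored above a randomly chosen negative case (ties count as half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    roc_auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "roc_auc": roc_auc}


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1, 0.8, 0.3]
toy_metrics = classification_metrics(toy_labels, toy_scores)
```

Note that accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC AUC is computed from the raw scores and summarizes ranking quality across all thresholds. This is why a model can report a higher ROC AUC than accuracy, as in the figures quoted above.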

Authors

  • Ghazi I Al Jowf
    Department of Public Health, College of Applied Medical Sciences, King Faisal University, Al Hofuf, 37912, Al Ahsa, Saudi Arabia. galjowf@kfu.edu.sa.
  • Manjur Kolhar
    Department of Health Management and Information Technology, College of Applied Medical Sciences, King Faisal University, Al-Ahsa, 36362, Saudi Arabia. mkolhar@kfu.edu.sa.