Dynamic Stroke Risk Stratification via Machine Learning: A Multi-Level Single-Center Study
Journal: medRxiv
Published Date: Jan 1, 2025
Abstract
Stroke is a leading global public health challenge and the second leading cause of death worldwide. In China, its burden continues to grow with population aging and the rising prevalence of unhealthy lifestyles. Traditional static stroke risk prediction models rely on cross-sectional data and therefore cannot capture changes over time in physiological parameters and behavioral factors. This study aimed to construct and validate a multi-tiered dynamic stroke risk prediction system. A single-center prospective longitudinal cohort study (2018–2022) enrolled community-dwelling residents and outpatients who completed three consecutive follow-ups. Three sequential machine learning models were developed for general population screening, high-risk population refinement, and longitudinal population monitoring, respectively. Models were validated with stratified cross-validation (10-fold for Model 1, 5-fold for Models 2 and 3), with the area under the curve (AUC) as the primary evaluation metric. Model 1, which predicts conversion to high-risk status in the general population, achieved an AUC of 0.835 and significantly reduced the missed detection rate for individuals with “normal static indicators but abnormal dynamic trends.” Model 2, which predicts stroke onset in the high-risk population, performed best with the LGB_Conservative algorithm (AUC 0.8479). Model 3, which predicts dual outcomes in the longitudinal population, achieved an AUC of 0.761, with stroke event rates of 3.1% in the low-risk group and 95.2% in the very high-risk group. This multi-tiered dynamic prediction system addresses key limitations of traditional static models and provides a novel tool for personalized stroke prevention in clinical and public health practice. External validation with multi-center data is still required to confirm its generalizability.
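The validation scheme named in the abstract (stratified k-fold cross-validation scored by AUC) can be illustrated with a minimal, dependency-free sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the function names `stratified_folds`, `roc_auc`, and `cross_validated_auc` are hypothetical, AUC is taken to mean the area under the ROC curve, and the `fit`/`predict` callables stand in for whatever learner is being evaluated.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with a similar class ratio in each fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share of this class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def roc_auc(y_true, scores):
    """ROC AUC: probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(X, y, k, fit, predict, seed=0):
    """Mean held-out AUC over k stratified folds.

    fit(X_train, y_train) returns a model; predict(model, x) returns a risk score.
    Both are hypothetical placeholders for the learner under evaluation.
    """
    folds = stratified_folds(y, k, seed)
    aucs = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        held_out_scores = [predict(model, X[j]) for j in test_idx]
        aucs.append(roc_auc([y[j] for j in test_idx], held_out_scores))
    return sum(aucs) / len(aucs)
```

Under these assumptions, a call such as `cross_validated_auc(X, y, k=10, fit=..., predict=...)` would mirror the 10-fold setting reported for Model 1, and `k=5` the setting reported for Models 2 and 3. Stratification matters most when events are rare, because an unstratified split can leave a fold with no positive cases, in which case AUC is undefined.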