Machine learning-based prediction of sudden cardiac death in the general population using electronic health record data.
Journal:
European journal of preventive cardiology
Published Date:
Mar 6, 2026
Abstract
AIMS: The vast majority of sudden cardiac death (SCD) cases occur in the general population, among people with few known risk factors, rather than only among patients already identified as being at high risk. This makes SCD very difficult to predict. A better screening tool is therefore needed to facilitate early identification.
METHODS AND RESULTS: To estimate the risk of SCD, we trained and validated a machine learning model on electronic health record (EHR) data covering 17 172 359 drug prescriptions and 1 639 057 hospital diagnoses recorded over up to 5 years for cases and controls. The model was trained on a cohort of 12 338 SCD cases and 12 338 controls from Greater Paris, collected from 2011 to 2015. We then validated the results on two external cohorts: a temporal cohort from the same area, from 2016 to 2020, with 11 620 SCD cases and 11 620 controls, and a geographical cohort from the University of Washington (Seattle, USA), from 2013 to 2021, with 892 SCD cases and 892 controls. In the 5 years preceding SCD, cardiovascular diagnoses were frequent in only a few patients, scarce in many, and entirely absent in 25.7% of subjects. Our model achieved an area under the curve of 0.81 [95% confidence interval (CI), 0.80-0.82] in the temporal validation cohort and 0.66 (95% CI, 0.58-0.73) in the geographical cohort. The prediction model discriminated SCD from the general population, especially in the highest risk decile, where it detected 26% and 33% of all SCD in the Paris and Seattle datasets, respectively. The model was specific to SCD and was not predictive of myocardial infarction. In addition to classical cardiovascular risk factors, various non-cardiovascular drugs and diagnoses contributed to the prediction.
CONCLUSION: We propose a prediction model designed to identify individuals at high risk of SCD among the general population. When combined with EHR data, this artificial intelligence model has the potential to assist in risk stratification and may help inform the prioritization of preventive strategies, contributing to more targeted use of cardiovascular health resources.
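The abstract reports two kinds of performance figures: an area under the ROC curve and the share of all SCD cases that fall in the highest decile of predicted risk. The following is a minimal sketch of how such metrics can be computed, not the authors' implementation. The function names `auc` and `top_decile_capture` are hypothetical, and the scores are synthetic values generated only for illustration; they do not reproduce any result from the study.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Probability that a randomly chosen case outranks a randomly chosen control.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_decile_capture(scores, labels):
    """Share of all cases whose score lies in the highest 10% of scores."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    k = max(1, len(ranked) // 10)  # size of the top decile
    cases_in_top = sum(y for _, y in ranked[:k])
    return cases_in_top / sum(labels)


# Synthetic data (assumption): 200 cases and 1800 controls, with cases
# drawn from a distribution shifted upward by one standard deviation.
rng = random.Random(0)
labels = [1] * 200 + [0] * 1800
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

auc_value = auc(scores, labels)
capture = top_decile_capture(scores, labels)
print(f"AUC: {auc_value:.2f}")
print(f"Cases captured in top decile: {capture:.0%}")
```

The pairwise AUC is quadratic in sample size, which is acceptable for a toy example; a rank-based formula or an established library routine would be the usual choice for cohorts of the size described above.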