Leveraging Explainable Temporal-Modelling Machine Learning to Identify Distinct Multimorbidity Trajectory Profiles in Acute Myocardial Infarction
Journal:
medRxiv
Published Date:
Jan 16, 2026
Abstract
Introduction: Acute myocardial infarction (AMI) remains a leading cause of mortality, and the coexistence of other conditions (i.e., multimorbidity) complicates management and outcomes. Healthcare providers face major challenges in managing patients with a multimorbid profile, particularly because multimorbidity is progressive: the temporal evolution of diseases is complex and has a profound impact on clinical outcomes.
Methods: Data on 12,701 AMI patients were selected from the UK Biobank cohort of 502,000 volunteers and grouped into pre-AMI (up to 1 year prior) and early post-AMI (within 5 years) periods. Participants were clustered using Dynamic Time Warping (DTW) clustering applied to sequences of ICD-10 diagnoses accumulated over time in the post-AMI period. Topic modelling of cluster-specific diagnoses informed thematic labels for the resulting profiles (clusters) of AMI patients. Using pre-AMI data together with socio-demographic variables (age, IMD score, BMI, and sex), four supervised models (Logistic Regression, Random Forest, XGBoost, and CatBoost) were developed to predict profile membership, with CatBoost achieving the highest accuracy. Model interpretability via SHapley Additive exPlanations (SHAP) identified the key diagnostic categories driving profile assignments. Survival analyses then compared SMART (Second Manifestations of Arterial Disease) risk scores across the profiles, adjusting for clinical covariates, to evaluate the adverse cardiovascular outcome of death. Finally, Phenome-Wide Association Studies (PheWAS) were employed to link profile-specific diagnostic themes to underlying genetic mechanisms.
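To illustrate why DTW suits diagnosis sequences that unfold at different speeds, the following is a minimal sketch of the classic dynamic-programming DTW distance in plain Python. It is not the authors' pipeline: the function name `dtw_distance`, the absolute-difference local cost, and the toy cumulative diagnosis counts are all assumptions made for illustration.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW distance with an absolute-difference local cost (illustrative choice)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its current point
                cost[i][j - 1],      # b advances, a repeats its current point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Hypothetical cumulative diagnosis counts over follow-up (made-up values).
patient_a = [1, 2, 2, 4, 5]
patient_b = [1, 1, 2, 2, 4, 5]  # same trajectory shape as patient_a, unfolding more slowly
patient_c = [0, 0, 1, 1, 1]     # a much flatter trajectory

print(dtw_distance(patient_a, patient_b))
print(dtw_distance(patient_a, patient_c))
```

Because DTW allows points to be repeated during alignment, `patient_a` and `patient_b` are treated as following the same trajectory despite their different lengths, whereas `patient_c` ends up further away. A pairwise distance matrix built this way could then be passed to a clustering algorithm; which algorithm the study used is not stated here.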
Results: Three multimorbidity profiles were identified in the post-AMI period: Acute cardio-renal-respiratory instability with chronic metabolic disease (ACUTE-CARD), Cardiometabolic disease with mixed arrhythmic-ischemic burden (CARDIOMIX), and Smoking-related cardiovascular disease with multimorbidity (SMO-CARD). CatBoost predicted profile membership with an AUROC of 0.77. Participants in the SMO-CARD cluster showed the highest mortality rates, while ACUTE-CARD had the most favourable outcomes (SMART risk score = 11.2; 6.8% CVD deaths). SMO-CARD displayed a broad range of cardiopulmonary and systemic associations. PheWAS revealed profile-specific genetic associations, and pathway enrichments were consistent with clinical features; for example, cardiometabolic genes were associated with the CARDIOMIX cluster, and immune-related pathways were associated with SMO-CARD, supporting the biological plausibility of these profiles.
Conclusion: Integrating temporal clustering with explainable machine learning reveals distinct multimorbidity patterns in AMI patients. This framework supports personalised risk stratification and outcome prediction in clinical care.