Predicting the timing of first sustained cognitive worsening in Alzheimer's disease using real-world clinical data and machine learning
Journal:
medRxiv
Published Date:
Jun 4, 2026
Abstract
Background: Cognitive assessments are sparsely documented in electronic health records (EHRs), limiting scalable detection of cognitive worsening in real-world clinical settings. Methods: We applied a deep neural network optimized for identifying clinical event timing from sparsely labeled gold-standard data (label-efficient incident phenotyping from longitudinal EHR, LATTE) to predict time-to-first sustained cognitive worsening in AD patients from a large healthcare system (2011-2022) with linkage to an AD Research Center registry in a subset. Sustained cognitive worsening was defined as cognitive decline persisting over ≥2 consecutive visits within 3 years. Separate LATTE models were trained with worsening labels from Clinical Dementia Rating (CDR), Mini-Mental State Examination (MMSE), and Montreal Cognitive Assessment (MoCA) scores; semi-supervised learning scaled predictions to larger imputation cohorts lacking sufficient longitudinal scores. We evaluated model performance using average time-specific area under the receiver operating characteristic curve (AUC), area between curves (ABC), and Brier scores. To demonstrate clinical utility, we examined whether predicted time-to-worsening differentiated clinically meaningful patient subgroups using competing-risk Cox proportional hazards models accounting for death. Findings: The cohort comprised 27,614 AD patients (65% women, 91% non-Hispanic White, mean [SD] age at start of follow-up 78.76 [9.53] years). In gold-standard cohorts (n: CDR=632, MMSE=710, MoCA=752; remaining patients formed imputation cohorts), LATTE demonstrated robust predictive performance (average time-AUC: CDR 0.816, MMSE 0.694, MoCA 0.710; ABC: CDR 0.067, MMSE 0.293, MoCA 0.078; Brier score: CDR 0.252, MMSE 0.437, MoCA 0.295).
APOE-ε4 carriers had shorter predicted time-to-worsening compared to non-carriers across all assessments in the imputation cohorts (HRs 1.241-1.376, all p<0.025), and k-means-derived patient clusters showed differential time-to-worsening in the overall and imputation cohorts (HRs 0.777-0.908, all p<0.001). Interpretation: LATTE enables scalable prediction of sustained cognitive worsening timing, differentiating clinically meaningful patient subgroups. This approach could improve AD clinical monitoring and decision-making in routine care and support targeted clinical trial enrichment.
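
The outcome definition in the Methods (decline persisting over ≥2 consecutive visits within 3 years) can be illustrated with a minimal labeling sketch. The abstract does not give the exact operationalization, so everything below is an assumption made for illustration only: decline is measured against the first recorded score, a visit counts as worsened when the score moves by at least a hypothetical `min_change` in the worse direction, and the confirming visit must fall within 3 years of the first worsened visit. The function name `first_sustained_worsening` and its parameters are hypothetical and are not taken from LATTE.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

Visit = Tuple[date, float]  # (visit date, cognitive score)


def first_sustained_worsening(
    visits: Sequence[Visit],
    higher_is_worse: bool,
    min_change: float = 1.0,      # assumed threshold, not from the paper
    window_days: int = 3 * 365,   # "within 3 years", assumed reading
) -> Optional[date]:
    """Return the date of the first worsened visit that is confirmed by the
    next consecutive visit, or None if no sustained worsening is found."""
    ordered = sorted(visits, key=lambda v: v[0])
    if not ordered:
        return None
    baseline = ordered[0][1]  # assumption: compare against the first score
    sign = 1.0 if higher_is_worse else -1.0

    def worsened(score: float) -> bool:
        return sign * (score - baseline) >= min_change

    # Walk over pairs of consecutive follow-up visits.
    for (d1, s1), (d2, s2) in zip(ordered[1:], ordered[2:]):
        if worsened(s1) and worsened(s2) and (d2 - d1).days <= window_days:
            return d1
    return None


# MMSE: lower scores are worse. The drop is confirmed at the next visit.
mmse = [(date(2015, 1, 1), 28), (date(2016, 1, 1), 25), (date(2017, 1, 1), 24)]
mmse_event = first_sustained_worsening(mmse, higher_is_worse=False)

# A transient dip that recovers at the next visit is not labeled.
dip = [(date(2015, 1, 1), 28), (date(2016, 1, 1), 25), (date(2017, 1, 1), 28)]
dip_event = first_sustained_worsening(dip, higher_is_worse=False)

# CDR: higher scores are worse, so the direction flag is flipped.
cdr = [(date(2015, 1, 1), 0.5), (date(2016, 1, 1), 1.0), (date(2017, 1, 1), 1.0)]
cdr_event = first_sustained_worsening(cdr, higher_is_worse=True, min_change=0.5)
```

Patients with too few longitudinal scores for a rule of this kind to apply are the ones the abstract assigns to the imputation cohorts, where the semi-supervised model supplies the predicted timing instead.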