Acute-Phase Machine Learning Prediction of 12-Month Aphasia and Discourse Recovery
Journal:
medRxiv
Published Date:
May 17, 2026
Abstract
Approximately 30-40% of stroke patients retain aphasia at 12 months. Early forecasting may guide rehabilitation and prognostic enrichment of clinical trials, yet machine learning (ML) prediction of language recovery has typically relied on chronic-phase data unavailable at the acute decision point. Whether acute features predict 12-month outcomes, and whether global severity and connected-speech recovery share substrates in an ML framework, is untested. We studied 73 patients with acute left-hemisphere ischemic stroke and aphasia (mean 2.8 days post-onset). Two 12-month outcomes were defined: aphasia resolution (Western Aphasia Battery-Revised Aphasia Quotient [WAB-AQ] ≥93.8) and discourse normalization (Modern Cookie Theft content units ≥22.1; N=61). Four ML algorithms were trained on four hierarchical feature sets (clinical, volumetric, anatomical, network-disconnection) using nested cross-validation and SHapley Additive exPlanations (SHAP) stability analysis. Acute WAB-AQ dominated (mean |SHAP| = 13.60, ~20x the next feature). For aphasia resolution, random forest achieved F1 = 0.874 (95% CI, 0.800-0.941), Pearson r = 0.827, mean absolute error (MAE) = 7.26 WAB-AQ points; clinical features alone achieved F1 = 0.851. For discourse, support vector regression achieved F1 = 0.725 (95% CI, 0.593-0.831), r = 0.617, MAE = 8.96 content units. Three predictors were shared (acute WAB-AQ, lesion volume, left pars triangularis); ventral-stream tracts were linked to aphasia resolution, whereas interhemispheric and prefrontal connectivity were linked to discourse. Both models overpredicted severe chronic outcomes. Acute-phase ML forecasts 12-month aphasia resolution accurately and discourse more modestly. Clinical features carry most predictive variance; acute imaging reveals shared and outcome-specific substrates mapping onto dual-stream architecture, supporting early stratification for rehabilitation and prognostic trial enrichment.
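The abstract reports a classification metric (F1) alongside regression metrics (Pearson r, MAE) for the same model. One way this can arise is by predicting a continuous 12-month score and then dichotomizing both observed and predicted scores at the outcome cutoff. The sketch below illustrates that reading only; it is an assumption, not the authors' confirmed pipeline. The function name `evaluate` and the example scores are hypothetical, and only the 93.8 WAB-AQ cutoff comes from the abstract.

```python
from math import sqrt

# WAB-AQ cutoff for aphasia resolution, taken from the abstract.
RESOLUTION_CUTOFF = 93.8


def evaluate(y_true, y_pred, cutoff=RESOLUTION_CUTOFF):
    """Return MAE, Pearson r, and F1 for continuous scores.

    F1 is computed after labelling a score as "resolved" when it is
    greater than or equal to `cutoff` (assumed dichotomization rule).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    # Mean absolute error in the units of the score.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

    # Pearson correlation between observed and predicted scores.
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / sqrt(var_t * var_p) if var_t > 0 and var_p > 0 else float("nan")

    # Dichotomize at the cutoff and count the confusion-matrix cells.
    true_pos = [t >= cutoff for t in y_true]
    pred_pos = [p >= cutoff for p in y_pred]
    tp = sum(a and b for a, b in zip(true_pos, pred_pos))
    fp = sum((not a) and b for a, b in zip(true_pos, pred_pos))
    fn = sum(a and (not b) for a, b in zip(true_pos, pred_pos))
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0

    return {"mae": mae, "pearson_r": r, "f1": f1}


if __name__ == "__main__":
    # Hypothetical observed and predicted 12-month WAB-AQ scores.
    observed = [95.0, 60.0, 98.0, 80.0, 94.0]
    predicted = [96.0, 70.0, 90.0, 85.0, 95.0]
    print(evaluate(observed, predicted))
```

With these made-up scores, two of the three truly resolved patients are predicted as resolved and there are no false positives, so F1 is 0.8 and MAE is 5.0 points. The same function would apply to the discourse outcome by passing `cutoff=22.1`.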