Multimodal Deep Learning for Stroke Prediction and Detection using Retinal Imaging and Clinical Data
Journal: arXiv
Published Date: May 5, 2025
Abstract
Stroke is a major public health problem, affecting millions worldwide. Deep
learning has recently demonstrated promise for enhancing the diagnosis and risk
prediction of stroke. However, existing methods rely on costly medical imaging
modalities, such as computed tomography. Recent studies suggest that retinal
imaging could offer a cost-effective alternative for cerebrovascular health
assessment due to the shared clinical pathways between the retina and the
brain. Hence, this study explores the impact of leveraging retinal images and
clinical data for stroke detection and risk prediction. We propose a multimodal
deep neural network that processes Optical Coherence Tomography (OCT) and
infrared reflectance retinal scans, combined with clinical data, such as
demographics, vital signs, and diagnosis codes. We pretrained our model with a
self-supervised learning framework on a real-world dataset of 37k scans, and
then fine-tuned and evaluated it on a smaller labeled subset. Our empirical
findings establish the predictive ability of the
considered modalities in detecting lasting effects in the retina associated
with acute stroke and forecasting future risk within a specific time horizon.
The experimental results demonstrate the effectiveness of our proposed
framework, which achieves a 5% AUROC improvement over the unimodal
image-only baseline and an 8% improvement over an existing
state-of-the-art foundation model. In conclusion, our study highlights the
potential of retinal imaging in identifying high-risk patients and improving
long-term outcomes.
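
For readers unfamiliar with the metric reported above, AUROC can be read as the probability that a randomly chosen positive case (e.g. a stroke patient) receives a higher model score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition, not the evaluation code used in the study. The function name `auroc` and the toy labels and scores are hypothetical and chosen only for illustration. The abstract does not say whether the reported gains are absolute or relative, so the sketch makes no assumption about that.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) AUROC: fraction of positive/negative pairs
    in which the positive case is scored higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and scores (NOT data from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

The quadratic loop is chosen for readability; for large evaluation sets a rank-based implementation from an established library would be the more practical choice.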