Prospective validation of deep-learning algorithms for diabetic retinopathy screening: A systematic review and meta-analysis.
Journal:
Survey of Ophthalmology
Published Date:
Dec 2, 2025
Abstract
Deep-learning (DL) algorithms are widely promoted for diabetic retinopathy (DR) screening, yet their prospective diagnostic accuracy is not well defined. PubMed, EMBASE, and ClinicalTrials.gov were searched through April 2025 for prospective evaluations of DL systems using color fundus images. Two reviewers screened records, extracted data, and applied QUADAS-2. Hierarchical bivariate random-effects models produced pooled sensitivity and specificity for referable and vision-threatening DR, analyzed separately at patient and eye level. Twenty-one prespecified moderators were explored with univariate and multivariate meta-regression; publication bias was assessed with Deeks' test. Seventy-three studies from 23 countries (255,330 examinations) met the criteria. Pooled patient-level sensitivity was 0.94 (95% CI 0.92-0.95) and specificity 0.90 (95% CI 0.87-0.93); eye-level sensitivity and specificity were 0.93 (95% CI 0.91-0.95) and 0.94 (95% CI 0.92-0.96), respectively. DR subtype, retinal-field strategy, camera form factor, and prevalence independently explained heterogeneity (p < 0.05). Performance matched or exceeded that of the pivotal FDA trials (IDx-DR, EyeArt). AI gradability was ≥95% in 60% of cohorts, including handheld and smartphone systems. DL-based DR screening achieves consistent, high accuracy across devices and care settings, enabling scalable deployment in primary care, pharmacies, and mobile clinics. Quality assurance and ongoing monitoring are essential to maximize population-level benefits.
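As a reading aid, the sensitivity and specificity figures quoted in the abstract can be related to a single study's 2x2 table with the minimal Python sketch below. It is not the hierarchical bivariate random-effects model used in the review, which pools many studies jointly; it only shows the per-study quantities that such a model takes as input. The function names, the Wilson score interval, and the example counts are illustrative assumptions and are not taken from the article.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, each with its interval, from one 2x2 table."""
    sensitivity = tp / (tp + fn)  # diseased cases correctly flagged
    specificity = tn / (tn + fp)  # non-diseased cases correctly cleared
    return {
        "sensitivity": (sensitivity, *wilson_interval(tp, tp + fn)),
        "specificity": (specificity, *wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts for a single screening cohort (not data from the review).
result = diagnostic_accuracy(tp=94, fp=20, fn=6, tn=180)
for name, (estimate, low, high) in result.items():
    print(f"{name}: {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

In a meta-analysis, per-study estimates of this kind are combined on the logit scale with between-study variance and the correlation between sensitivity and specificity taken into account, which is why pooled intervals cannot be obtained by simply summing the counts across studies.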