Metadata-enhanced contrastive learning from retinal optical coherence tomography images.

Journal: Medical Image Analysis

Abstract

Deep learning has the potential to automate the screening, monitoring and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. Firstly, several image transformations that have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Secondly, the assumption made by conventional methods, that any two images are dissimilar, is systematically misleading in medical datasets depicting the same anatomy and disease. This is exacerbated in longitudinal image datasets that repeatedly image the same patient cohort to monitor their disease progression over time. In this paper, we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships. To this end, we employ records of patient identity, eye position (i.e., left or right) and time series information. In experiments using two large longitudinal datasets containing 170,427 retinal optical coherence tomography (OCT) images of 7912 patients with age-related macular degeneration (AMD), we evaluate the utility of using metadata to incorporate the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks related to AMD. We find benefits in both low-data and high-data regimes across tasks ranging from AMD stage and type classification to the prediction of visual acuity. Due to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefits of including available metadata in contrastive pretraining.

Authors

  • Robbie Holland
    BioMedIA, Imperial College London, London, United Kingdom. Electronic address: robert.holland15@ic.ac.uk.
  • Oliver Leingang
    Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria.
  • Hrvoje Bogunović
    Christian Doppler Laboratory for Ophthalmic Image Analysis (OPTIMA), Department of Ophthalmology, Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria.
  • Sophie Riedl
Christian Doppler Laboratory for Ophthalmic Image Analysis (OPTIMA), Department of Ophthalmology, Medical University of Vienna, Vienna, Austria.
  • Lars Fritsche
    Department of Statistics, University of Michigan, Ann Arbor, Michigan, USA.
  • Toby Prevost
    Nightingale-Saunders Clinical Trials & Epidemiology Unit, King's College London, London, United Kingdom.
  • Hendrik P N Scholl
    Department of Ophthalmology, University of Basel, Basel, Switzerland.
  • Ursula Schmidt-Erfurth
    Christian Doppler Laboratory for Ophthalmic Image Analysis (OPTIMA), Department of Ophthalmology, Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria.
  • Sobha Sivaprasad
Moorfields Eye Hospital, City Road Campus, London, United Kingdom.
  • Andrew J Lotery
    Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, Hampshire, United Kingdom.
  • Daniel Rueckert
Biomedical Image Analysis (BioMedIA) Group, Department of Computing, Imperial College London, London, United Kingdom. Electronic address: d.rueckert@imperial.ac.uk.
  • Martin J Menten
    BioMedIA, Imperial College London, London, United Kingdom; Institute for AI and Informatics in Medicine, Technical University of Munich, Munich, Bavaria, Germany.