A multimodal visual-language foundation model for computational ophthalmology.

Journal: npj Digital Medicine
Published Date:

Abstract

Early detection of eye diseases is vital for preventing vision loss. Existing ophthalmic artificial intelligence models focus on single modalities, overlooking multi-view information and struggling with rare diseases due to long-tail distributions. We propose EyeCLIP, a multimodal visual-language foundation model trained on 2.77 million ophthalmology images from 11 modalities with partial clinical text. Our novel pretraining strategy combines self-supervised reconstruction, multimodal image contrastive learning, and image-text contrastive learning to capture shared representations across modalities. EyeCLIP demonstrates robust performance across 14 benchmark datasets, excelling in disease classification, visual question answering, and cross-modal retrieval. It also exhibits strong few-shot and zero-shot capabilities, enabling accurate predictions in real-world, long-tail scenarios. EyeCLIP offers significant potential for detecting both ocular and systemic diseases, and bridging gaps in real-world clinical applications.
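
The abstract names three pretraining objectives (self-supervised reconstruction, multimodal image contrastive learning, and image-text contrastive learning) without giving their form. As a rough illustration only, the sketch below combines a masked reconstruction error with two CLIP-style symmetric contrastive terms in NumPy. The function names, the temperature of 0.07, the equal loss weights, and the `has_text` mask used to skip samples without clinical text are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_contrastive_loss(a, b, temperature=0.07):
    # Row i of `a` and row i of `b` are a positive pair; all other rows are negatives.
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature
    idx = np.arange(len(a))
    loss_ab = -log_softmax(logits)[idx, idx].mean()
    loss_ba = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)


def reconstruction_loss(pred, target, mask):
    # Mean squared error per patch, averaged over masked patches only (mask == 1).
    per_patch = ((pred - target) ** 2).mean(axis=-1)
    return (per_patch * mask).sum() / max(mask.sum(), 1)


def combined_loss(img_a, img_b, txt, has_text, pred, target, mask,
                  weights=(1.0, 1.0, 1.0)):
    # Hypothetical weighting: the relative weights are an assumption of this sketch.
    l_rec = reconstruction_loss(pred, target, mask)
    # Two modalities (e.g. two imaging types) of the same eye form a positive pair.
    l_img = symmetric_contrastive_loss(img_a, img_b)
    # Only samples that actually have text contribute to the image-text term.
    if has_text.any():
        l_txt = symmetric_contrastive_loss(img_a[has_text], txt[has_text])
    else:
        l_txt = 0.0
    return weights[0] * l_rec + weights[1] * l_img + weights[2] * l_txt
```

In this sketch, a batch in which no sample has an accompanying report still yields a valid loss from the reconstruction and image-image terms, which is one simple way to handle partially available text.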

Authors

  • Danli Shi
    State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China.
  • Weiyi Zhang
    Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China.
  • Jiancheng Yang
    Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, P.R. China.
  • Siyu Huang
    Department of Surgery, University of Melbourne, Parkville, Victoria, Australia.
  • Xiaolan Chen
    Jiangsu Agri-animal Husbandry Vocational College, Taizhou, 225300 China.
  • Pusheng Xu
    School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China.
  • Kai Jin
    Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.
  • Shan Lin
    Key Laboratory of Bioorganic Synthesis of Zhejiang Province, College of Biotechnology and Bioengineering, Zhejiang University of Technology, Hangzhou 310014, China.
  • Jin Wei
    Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, No. 100 Haining Road, Shanghai 200080, PR China.
  • Mayinuer Yusufu
    Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, Australia; Department of Surgery (Ophthalmology), The University of Melbourne, Melbourne, Australia.
  • Shunming Liu
    Department of Ophthalmology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, 510080, China.
  • Qing Zhang
    Department of Respiratory Medicine, Affiliated Zhongshan Hospital of Dalian University, Dalian, China.
  • Zongyuan Ge
    AIM for Health Lab, Faculty of IT, Monash University, Clayton, Victoria, Australia; Monash-Airdoc Research Lab, Faculty of IT, Monash University, Clayton, Victoria, Australia.
  • Xun Xu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Mingguang He
    State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China; Centre for Eye Research Australia; Departments of Ophthalmology and Surgery, University of Melbourne, Melbourne, Australia. Electronic address: mingguang.he@unimelb.edu.au.
