Multimodal, Multi-Disease Medical Imaging Foundation Model (MerMED-FM)

Journal: arXiv
Published Date:

Abstract

Current artificial intelligence models for medical imaging are predominantly single-modality and single-disease. Attempts to create multimodal and multi-disease models have resulted in inconsistent clinical accuracy. Furthermore, training these models typically requires large, well-labelled datasets that are labour-intensive to produce. We developed MerMED-FM, a state-of-the-art multimodal, multi-specialty foundation model trained using self-supervised learning and a memory module. MerMED-FM was trained on 3.3 million medical images from over ten specialties and seven modalities, including computed tomography (CT), chest X-rays (CXR), ultrasound (US), pathology patches, color fundus photography (CFP), optical coherence tomography (OCT) and dermatology images. MerMED-FM was evaluated across multiple diseases and compared against existing foundation models. Strong performance was achieved across all modalities, with AUROCs of 0.988 (OCT), 0.982 (pathology), 0.951 (US), 0.943 (CT), 0.931 (skin), 0.894 (CFP) and 0.858 (CXR). MerMED-FM has the potential to be a highly adaptable, versatile, cross-specialty foundation model that enables robust medical imaging interpretation across diverse medical disciplines.

Authors

  • Yang Zhou
  • Chrystie Wan Ning Quek
  • Jun Zhou
  • Yan Wang
  • Yang Bai
  • Yuhe Ke
  • Jie Yao
  • Laura Gutierrez
  • Zhen Ling Teo
  • Darren Shu Jeng Ting
  • Brian T. Soetikno
  • Christopher S. Nielsen
  • Tobias Elze
  • Zengxiang Li
  • Linh Le Dinh
  • Lionel Tim-Ee Cheng
  • Tran Nguyen Tuan Anh
  • Chee Leong Cheng
  • Tien Yin Wong
  • Nan Liu
  • Iain Beehuat Tan
  • Tony Kiat Hon Lim
  • Rick Siow Mong Goh
  • Yong Liu
  • Daniel Shu Wei Ting