Multi-modal large language models in radiology: principles, applications, and potential.

Journal: Abdominal Radiology (New York)
PMID:

Abstract

Large language models (LLMs) and multi-modal large language models (MLLMs) represent the cutting edge of artificial intelligence. This review provides a comprehensive overview of their capabilities and potential impact on radiology. Unlike most existing reviews, which focus solely on LLMs, this work examines both LLMs and MLLMs, highlighting their potential to support radiology workflows such as report generation, image interpretation, electronic health record (EHR) summarization, differential diagnosis generation, and patient education. By streamlining these tasks, LLMs and MLLMs could reduce radiologist workload, improve diagnostic accuracy, support interdisciplinary collaboration, and ultimately enhance patient care. We also discuss key limitations, including the limited capacity of current MLLMs to interpret 3D medical images and to integrate information from image and text data, as well as the lack of effective evaluation methods, and we introduce ongoing efforts to address these challenges.

Authors

  • Yiqiu Shen
  • Yanqi Xu
    New York University, New York, USA.
  • Jiajian Ma
    New York University, New York, USA.
  • Wushuang Rui
    New York University, New York, USA.
  • Chen Zhao
    Department of Ophthalmology, Fudan Eye & ENT Hospital, Shanghai, China.
  • Laura Heacock
    Bernard and Irene Schwartz Center for Biomedical Imaging, Department of Radiology, New York University School of Medicine, New York, New York, USA.
  • Chenchan Huang
    Department of Radiology, NYU Grossman School of Medicine, New York, NY, USA.