SigPhi-Med: A lightweight vision-language assistant for biomedicine.

Journal: Journal of Biomedical Informatics
Published Date:

Abstract

BACKGROUND: Recent advances in general multimodal large language models (MLLMs) have substantially improved the performance of biomedical MLLMs across diverse medical tasks, demonstrating significant transformative potential. However, the large parameter counts of MLLMs demand substantial computational resources during both training and inference, which limits their feasibility in resource-constrained clinical settings. This study aims to develop a lightweight biomedical multimodal small language model (MSLM) to mitigate this limitation.

Authors

  • Feizhong Zhou
    College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401120, China.
  • Xingyue Liu
    College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401120, China.
  • Qiao Zeng
    Jingtai Technology Co. Ltd, Floor 4, No. 9, Yifenghua Industrial Zone, 91 Huaning Road, Longhua District, Shenzhen, Guangdong Province 518109, China.
  • Zhuhan Li
    College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401120, China.
  • Hanguang Xiao
    Chongqing Key Laboratory of Modern Photoelectric Detection Technology and Instrument, School of Optoelectronic Information, Chongqing University of Technology, No. 69 Hongguang Road, Banan District, Chongqing 400050, PR China. Electronic address: simenxiao1211@163.com.