Vision-language model performance on the Japanese Nuclear Medicine Board Examination: high accuracy in text but challenges with image interpretation.

Journal: Annals of Nuclear Medicine

Abstract

OBJECTIVE: Vision-language models (VLMs) extend large language models with visual input. VLMs are developing rapidly and their accuracy continues to improve, yet the performance of state-of-the-art VLMs, including reasoning models, in nuclear medicine is not yet clear. We evaluated state-of-the-art VLMs on problems from past Japanese Nuclear Medicine Board Examinations (JNMBE) and assessed their strengths and limitations.
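The objective describes evaluating VLMs on JNMBE items that combine question text with images. Purely as an illustration, the sketch below shows one way such a query could be scripted; it assumes the OpenAI Python SDK and the "gpt-4o" model name, and the image path, question text, and answer choices are hypothetical placeholders, not the authors' actual evaluation protocol.

    import base64
    from openai import OpenAI  # assumes the OpenAI Python SDK is installed

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask_vlm(image_path: str, question: str, choices: list[str]) -> str:
        """Send one multiple-choice exam item (image + text) to a VLM and return its reply."""
        with open(image_path, "rb") as f:
            image_b64 = base64.b64encode(f.read()).decode("utf-8")

        # Format the stem and lettered choices as a single text prompt.
        prompt = (
            question
            + "\n"
            + "\n".join(f"{label}. {text}" for label, text in zip("abcde", choices))
            + "\nAnswer with the single letter of the best choice."
        )

        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder; swap in the VLM under evaluation
            messages=[
                {
                    "role": "user",
                    "content": [
                        {"type": "text", "text": prompt},
                        {
                            "type": "image_url",
                            "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                        },
                    ],
                }
            ],
        )
        return response.choices[0].message.content

    # Hypothetical usage with a placeholder exam item:
    # answer = ask_vlm("item_01.png", "Which diagnosis is most likely?",
    #                  ["choice A", "choice B", "choice C", "choice D", "choice E"])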

Authors

  • Rintaro Ito
    Department of Innovative Biomedical Visualization, Nagoya University Graduate School of Medicine, Showa-ku, Nagoya, Japan.
  • Keita Kato
    Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan.
  • Marina Higashi
    Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan.
  • Yumi Abe
    Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan.
  • Ryogo Minamimoto
    Division of Nuclear Medicine, Department of Radiology, National Center for Global Health and Medicine, 1-21-1 Toyama, Shinjuku-ku, Tokyo, 162-8655, Japan.
  • Katsuhiko Kato
    Functional Medical Imaging, Biomedical Imaging Sciences, Division of Advanced Information Health Sciences, Department of Integrated Health Sciences, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan.
  • Toshiaki Taoka
    Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan.
  • Shinji Naganawa
Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan.
