A systematic review of large language model (LLM) evaluations in clinical medicine.

Journal: BMC Medical Informatics and Decision Making
Published Date:

Abstract

BACKGROUND: Large Language Models (LLMs), advanced AI tools based on transformer architectures, demonstrate significant potential in clinical medicine by enhancing decision support, diagnostics, and medical education. However, their integration into clinical workflows requires rigorous evaluation to ensure reliability, safety, and ethical alignment.

Authors

  • Sina Shool
    Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.
  • Sara Adimi
    Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, 1995614331, Iran.
  • Reza Saboori Amleshi
    Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.
  • Ehsan Bitaraf
    Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.
  • Reza Golpira
    Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.
  • Mahmood Tara
    Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.