AI in the ED: Assessing the efficacy of GPT models vs. physicians in medical score calculation.

Journal: The American Journal of Emergency Medicine

Abstract

BACKGROUND AND AIMS: Artificial intelligence (AI) models such as GPT-3.5 and GPT-4 have shown promise across various domains but remain underexplored in healthcare. Emergency departments (EDs) rely on established scoring systems, such as the National Institutes of Health Stroke Scale (NIHSS) and the HEART score, to guide clinical decision-making. This study evaluates the proficiency of GPT-3.5 and GPT-4 against that of experienced ED physicians in calculating five commonly used medical scores.

Authors

  • Gal Ben Haim
    Department of Emergency Medicine, Sheba Medical Center, Ramat-Gan, Israel; Tel Aviv University, Sackler Faculty of Medicine, Tel Aviv, Israel. Electronic address: galushbh@gmail.com.
  • Adi Braun
    Department of Emergency Medicine, Sheba Medical Center, Ramat-Gan, Israel; Tel Aviv University, Sackler Faculty of Medicine, Tel Aviv, Israel.
  • Haggai Eden
    Department of Emergency Medicine, Sheba Medical Center, Ramat-Gan, Israel; Tel Aviv University, Sackler Faculty of Medicine, Tel Aviv, Israel.
  • Livnat Burshtein
    Department of Emergency Medicine, Sheba Medical Center, Ramat-Gan, Israel.
  • Yiftach Barash
    Department of Diagnostic Imaging, Chaim Sheba Medical Center, Tel Hashomer, Israel.
  • Avinoah Irony
    Department of Emergency Medicine, Sheba Medical Center, Ramat-Gan, Israel; Tel Aviv University, Sackler Faculty of Medicine, Tel Aviv, Israel.
  • Eyal Klang
    Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA.