Assessment of Large Language Model Performance on Medical School Essay-Style Concept Appraisal Questions: Exploratory Study.

Journal: JMIR Medical Education

Published Date: Jun 16, 2025

Abstract

Bing Chat (subsequently renamed Microsoft Copilot), a ChatGPT 4.0-based large language model, performed comparably to medical students on essay-style concept appraisal questions, and assessors struggled to distinguish artificial intelligence (AI) responses from human responses. These results highlight the need to prepare students and educators for a future shaped by AI by fostering reflective learning practices and critical thinking.

Authors

Seysha Mehta

Class of 2027, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, OH, USA.

Eliot N Haddad

Cleveland Clinic Lerner College of Medicine, School of Medicine, Case Western Reserve University, 9500 Euclid Ave, G10, Cleveland, OH, 44195, United States, 1 2164456512, 1 2164451007.

Indira Bhavsar Burke

Department of Internal Medicine, The University of Texas Southwestern Medical Center, Dallas, TX, United States.

Alana K Majors

Cleveland Clinic Lerner College of Medicine, School of Medicine, Case Western Reserve University, 9500 Euclid Ave, G10, Cleveland, OH, 44195, United States, 1 2164456512, 1 2164451007.

Rie Maeda

Cleveland Clinic Lerner College of Medicine, School of Medicine, Case Western Reserve University, 9500 Euclid Ave, G10, Cleveland, OH, 44195, United States, 1 2164456512, 1 2164451007.

Sean M Burke

Department of Internal Medicine, The University of Texas Southwestern Medical Center, Dallas, TX, United States.

Abhishek Deshpande

Department of Mathematics, University of Wisconsin-Madison, Madison, WI, USA.

Amy S Nowacki

Cleveland Clinic, USA.

Christina C Lindenmeyer

Cleveland Clinic Lerner College of Medicine, School of Medicine, Case Western Reserve University, 9500 Euclid Ave, G10, Cleveland, OH, 44195, United States, 1 2164456512, 1 2164451007.

Neil Mehta

Cleveland Clinic, USA.

Keywords

Artificial Intelligence; Education, Medical, Undergraduate; Educational Measurement; Humans; Language; Large Language Models; Schools, Medical; Students, Medical

External Resources

PubMed ID (PMID): 40523238
