LLM-Generated Multiple Choice Practice Quizzes for Pre-Clinical Medical Students: Prevalence of Item Writing Flaws.

Journal: Advances in Physiology Education

Abstract

Multiple choice questions (MCQs) are frequently used for assessment in medical education. Automated generation of MCQs in board-exam format could save faculty significant effort and provide a wider set of practice materials for students. The goal of this study was to explore the feasibility of using ChatGPT by OpenAI to generate USMLE/COMLEX-USA-style practice quiz items as study aids. Researchers gave second-year medical students studying renal physiology access to a set of practice quizzes composed of ChatGPT-generated questions. Independent experts evaluated the generated exam items for quality and adherence to NBME/NBOME guidelines. Forty-nine percent of questions contained item writing flaws, and 22% contained factual or conceptual errors; however, 59/65 (91%) were categorized as a reasonable starting point for revision. These results demonstrate the feasibility of using large language model (LLM)-generated practice questions in medical education, but only with supervision by a subject matter expert trained in exam item writing.

Authors

  • Troy Camarata
    Baptist University College of Osteopathic Medicine.
  • Lise McCoy
    Department of Academic Affairs, New York Institute of Technology College of Osteopathic Medicine (NYITCOM), Jonesboro, USA.
  • Robert L Rosenberg
    Drexel University College of Medicine.
  • Kelsey R Temprine Grellinger
    Western Michigan University Homer Stryker M.D. School of Medicine.
  • Kylie Brettschneider
    New York Institute of Technology College of Osteopathic Medicine.
  • Jonathan Berman
    New York Institute of Technology College of Osteopathic Medicine.
