LLM-Generated Multiple Choice Practice Quizzes for Pre-Clinical Medical Students: Prevalence of Item Writing Flaws.
Journal:
Advances in Physiology Education
Published Date:
Jun 14, 2025
Abstract
Multiple choice questions (MCQs) are frequently used for assessment in medical education. Automated generation of MCQs in board-exam format could save faculty significant effort and provide a wider set of practice materials for students. The goal of this study was to explore the feasibility of using ChatGPT by OpenAI to generate USMLE/COMLEX-USA-style practice quiz items as study aids. Researchers gave second-year medical students studying renal physiology access to a set of practice quizzes composed of ChatGPT-generated questions. The generated exam items were evaluated by independent experts for quality and adherence to NBME/NBOME guidelines. Forty-nine percent of questions contained item-writing flaws, and 22% contained factual or conceptual errors. However, 59/65 (91%) were judged a reasonable starting point for revision. These results demonstrate the feasibility of large language model (LLM)-generated practice questions in medical education, but only under the supervision of a subject matter expert trained in exam item writing.