GPT-4 versus human authors in clinically complex MCQ creation: A blinded analysis of item quality.

Journal: Medical Teacher
Published Date:

Abstract

PURPOSE: To compare the structural quality of multiple-choice questions (MCQs) generated by GPT-4, a large language model and a type of artificial intelligence (AI), against human-authored items at both novice and expert levels.

Authors

  • Hannah Wu
    Adelaide Medical School, University of Adelaide, Adelaide, Australia.
  • Toby Zerner
    Faculty of Health and Medical Sciences, University of Adelaide, Australia.
  • Daniel Lee
    Medical College of Georgia, Augusta University, 1120 15th St. Augusta, GA 30912, USA.
  • Stefan Court-Kowalski
    Adelaide Medical School, University of Adelaide, Adelaide, Australia.
  • Peter Devitt
    eMedici, Adelaide, Australia.
  • Edward Palmer
    Bloomsbury Institute of Intensive Care Medicine, University College London, London, UK. edward.palmer@ucl.ac.uk.
