ChatGPT versus human in generating medical graduate exam multiple choice questions: A multinational prospective study (Hong Kong S.A.R., Singapore, Ireland, and the United Kingdom).

Journal: PLOS ONE
PMID:

Abstract

INTRODUCTION: Large language models, in particular ChatGPT, have showcased remarkable language processing capabilities. Given the substantial workload of university medical staff, this study aims to assess the quality of multiple-choice questions (MCQs) generated by ChatGPT for use in graduate medical examinations, compared with questions written by university professoriate staff based on standard medical textbooks.

Authors

  • Billy Ho Hung Cheung
    L.K.S. Faculty of Medicine, University of Hong Kong, Hong Kong, Hong Kong S.A.R.
  • Gary Kui Kai Lau
    L.K.S. Faculty of Medicine, University of Hong Kong, Hong Kong, Hong Kong S.A.R.
  • Gordon Tin Chun Wong
    L.K.S. Faculty of Medicine, University of Hong Kong, Hong Kong, Hong Kong S.A.R.
  • Elaine Yuen Phin Lee
    L.K.S. Faculty of Medicine, University of Hong Kong, Hong Kong, Hong Kong S.A.R.
  • Dhananjay Kulkarni
    Department of Surgery, University of Edinburgh, Edinburgh, United Kingdom.
  • Choon Sheong Seow
    Department of Surgery, National University Cancer Institute Singapore, Singapore, Singapore.
  • Ruby Wong
    Department of Surgery, University of Galway, Galway, Ireland.
  • Michael Tiong-Hong Co
    L.K.S. Faculty of Medicine, University of Hong Kong, Hong Kong, Hong Kong S.A.R.