Performance of three artificial intelligence (AI)-based large language models in standardized testing; implications for AI-assisted dental education.

Journal: Journal of periodontal research
PMID:

Abstract

INTRODUCTION: The rise of novel computer technologies and automated data analytics has the potential to change the course of dental education. In line with our long-term goal of harnessing the power of AI to augment didactic teaching, the objective of this study was to quantify the accuracy of responses provided by ChatGPT (GPT-4 and GPT-3.5) and Google Gemini, the three primary large language models (LLMs), to the annual in-service examination questions posed by the American Academy of Periodontology (AAP), and to compare that accuracy with the accuracy of human graduate students (control group).

Authors

  • Hamoun Sabri
    Department of Periodontics and Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, Michigan, USA.
  • Muhammad H A Saleh
    Department of Periodontics and Oral Medicine, University of Michigan School of Dentistry, Ann Arbor, Michigan, USA.
  • Parham Hazrati
    Department of Periodontics and Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, Michigan, USA.
  • Keith Merchant
    Naval Post-Graduate Dental School, Bethesda, Maryland, USA.
  • Jonathan Misch
    Department of Periodontics and Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, Michigan, USA.
  • Purnima S Kumar
    Department of Periodontics and Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, Michigan, USA.
  • Hom-Lay Wang
    Department of Periodontics and Oral Medicine, the University of Michigan School of Dentistry, Ann Arbor, MI 48109.
  • Shayan Barootchi
    Department of Periodontics and Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, Michigan, USA.