Comparison of ChatGPT-3.5, ChatGPT-4, and Orthopaedic Resident Performance on Orthopaedic Assessment Examinations.

Journal: The Journal of the American Academy of Orthopaedic Surgeons
Published Date:

Abstract

INTRODUCTION: Artificial intelligence (AI) programs can answer complex queries, including medical profession examination questions. The purpose of this study was to compare the performance of orthopaedic residents (ortho residents) with that of Chat Generative Pretrained Transformer (ChatGPT)-3.5 and GPT-4 on orthopaedic assessment examinations. A secondary objective was to perform a subgroup analysis comparing each group's performance on questions that required image interpretation versus text-only questions.

Authors

  • Patrick A Massey
    Department of Orthopaedic Surgery, Louisiana State University Health Sciences Center Shreveport, Shreveport, LA.
  • Carver Montgomery
  • Andrew S Zhang
    Department of Orthopedic Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island, USA.