Comparative accuracy of ChatGPT-4, Microsoft Copilot and Google Gemini in the Italian entrance test for healthcare sciences degrees: a cross-sectional study.

Journal: BMC Medical Education

Published Date: Jun 26, 2024

Abstract

BACKGROUND: Artificial intelligence (AI) chatbots are emerging educational tools for students in healthcare science. However, their accuracy must be assessed before they are adopted in educational settings. This study aimed to assess the accuracy of three AI chatbots (ChatGPT-4, Microsoft Copilot and Google Gemini) in predicting the correct answers in the Italian standardized entrance examination for healthcare science degrees (CINECA test). Secondarily, we assessed the narrative coherence of the AI chatbots' responses (i.e., text output) based on three qualitative metrics: the logical rationale behind the chosen answer, the presence of information internal to the question, and the presence of information external to the question.

Authors

Giacomo Rossettini

School of Physiotherapy, University of Verona, Verona, Italy. giacomo.rossettini@gmail.com.
Lia Rodeghiero

Department of Rehabilitation, Hospital of Merano (SABES-ASDAA), Teaching Hospital of Paracelsus Medical University (PMU), Merano-Meran, Italy. lia.rodeghiero@sabes.it.
Federica Corradi

School of Speech Therapy, University of Verona, Verona, Italy.
Chad Cook

Department of Orthopaedics, Duke University, Durham, NC, USA.
Paolo Pillastrini

Department of Biomedical and Neuromotor Sciences (DIBINEM), Alma Mater University of Bologna, Bologna, Italy.
Andrea Turolla

Department of Biomedical and Neuromotor Sciences (DIBINEM), Alma Mater University of Bologna, Bologna, Italy.
Greta Castellini

Unit of Clinical Epidemiology, IRCCS Istituto Ortopedico Galeazzi, Milan, Italy.
Stefania Chiappinotto

Department of Medical Sciences, University of Udine, Udine, Italy. stefania.chiappinotto@uniud.it.
Silvia Gianola

Unit of Clinical Epidemiology, IRCCS Istituto Ortopedico Galeazzi, Milan, Italy. silvia.gianola@grupposandonato.it.
Alvisa Palese

Department of Medical Sciences, University of Udine, Udine, Italy. Alvisa.palese@uniud.it.

Keywords

Artificial Intelligence; Cross-Sectional Studies; Educational Measurement; Female; Humans; Italy; Male

External Resources

PubMed ID (PMID): 38926809
