Performance evaluation of ChatGPT-4.0 and Gemini on image-based neurosurgery board practice questions: A comparative analysis.

Journal: Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia

PMID: 39938135

Abstract

INTRODUCTION: Artificial intelligence (AI) has gained significant attention in medicine, particularly in neurosurgery, where its potential is often discussed and occasionally feared. Large language models (LLMs), such as ChatGPT-4.0 (OpenAI) and Gemini (formerly known as Bard, Google DeepMind), have shown promise in text-based tasks but remain underexplored in image-based domains, which are essential for neurosurgery. This study evaluates the performance of ChatGPT-4.0 and Gemini on image-based neurosurgery board practice questions, focusing on their ability to interpret visual data, a critical aspect of neurosurgical decision-making.

Authors

Alana M McNulty
Department of Neurosurgery, Albany Medical Center, Albany, NY, USA.

Harshitha Valluri
Department of Neurosurgery, Albany Medical Center, Albany, NY, USA.

Avi A Gajjar
Department of Neurological Surgery, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA.

Amanda Custozzo
Department of Neurosurgery, Albany Medical Center, Albany, NY, USA.

Nicholas C Field
Department of Neurosurgery, Albany Medical Center, Albany, NY, USA.

Alexandra R Paul
Department of Neurosurgery, Albany Medical Center, Albany, NY, USA. Electronic address: PaulA1@amc.edu.

Keywords

Artificial Intelligence; Generative Artificial Intelligence; Humans; Magnetic Resonance Imaging; Neurosurgery; Neurosurgical Procedures; Specialty Boards
