Performance evaluation of ChatGPT-4.0 and Gemini on image-based neurosurgery board practice questions: A comparative analysis.
Journal: Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia
PMID: 39938135
Abstract
INTRODUCTION: Artificial intelligence (AI) has gained significant attention in medicine, particularly in neurosurgery, where its potential is often discussed and occasionally feared. Large language models (LLMs), such as ChatGPT-4.0 (OpenAI) and Gemini (formerly known as Bard, Google DeepMind), have shown promise in text-based tasks but remain underexplored in image-based domains, which are essential for neurosurgery. This study evaluates the performance of ChatGPT-4.0 and Gemini on image-based neurosurgery board practice questions, focusing on their ability to interpret visual data, a critical aspect of neurosurgical decision-making.