Specialized Large Language Model Outperforms Neurologists at Complex Diagnosis in Blinded Case-Based Evaluation.

Journal: Brain sciences

Published Date: Mar 27, 2025

Abstract

Background/Objectives: Artificial intelligence (AI), particularly large language models (LLMs), has demonstrated versatility in various applications but faces challenges in specialized domains like neurology. This study evaluates a specialized LLM's capability and trustworthiness in complex neurological diagnosis, comparing its performance to that of neurologists in simulated clinical settings.

Methods: We deployed GPT-4 Turbo (OpenAI, San Francisco, CA, US) through Neura (Sciense, New York, NY, US), an AI infrastructure with a dual-database architecture integrating "long-term memory" and "short-term memory" components on a curated neurological corpus. Five representative clinical scenarios were presented to 13 neurologists and the AI system. Participants formulated differential diagnoses based on initial presentations, followed by definitive diagnoses after receiving conclusive clinical information. Two senior academic neurologists blindly evaluated all responses, while an independent investigator assessed the verifiability of AI-generated information.

Results: The AI achieved a significantly higher normalized score (86.17%) than the neurologists (55.11%, p < 0.001). For differential diagnosis questions, the AI scored 85% versus 46.15% for neurologists, and for final diagnosis, 88.24% versus 70.93%. The AI obtained 15 maximum scores in its 20 evaluations and responded in under 30 s, compared to the neurologists' average of 9 min. All AI-provided references were classified as relevant, with no hallucinatory content detected.

Conclusions: A specialized LLM demonstrated superior diagnostic performance compared to practicing neurologists across complex clinical challenges. This indicates that appropriately harnessed LLMs with curated knowledge bases can achieve domain-specific relevance in complex clinical disciplines, suggesting potential for AI as a time-efficient asset in clinical practice.

Authors

Sami Barrit
Department of Neurosurgery, CHU Tivoli, 7110 La Louvière, Belgium.

Nathan Torcida
Department of Neurology, Hôpital Universitaire de Bruxelles (HUB), 1070 Brussels, Belgium.

Aurelien Mazeraud
Anesthésie-Réanimation, GHU Paris, Pôle Neuro, 75014 Paris, France.

Sebastien Boulogne
Neurophysiology and Epileptology, Universite de Lyon, 69007 Lyon, France.

Jeanne Benoit
Neurology, CHU de Nice, Université Côte d'Azur, UMR2CA, 06000 Nice, France.

Timothée Carette
Neurology, Université Catholique de Louvain, Clinique Saint-Pierre Ottignies, 1348 Louvain-la-Neuve, Belgium.

Thibault Carron
LIP6, CNRS, Sorbonne Université, 75005 Paris, France.

Bertil Delsaut
Department of Neurology, Tivoli Hospital, La Louvière, Belgium.

Eva Diab
Clinical Neurophysiology, CHU Amiens Picardie, CHIMERE UR 7516 UPJV, 80054 Amiens, France.

Hugo Kermorvant
Neurophy Lab, Université Libre de Bruxelles, 1050 Brussels, Belgium.

Adil Maarouf
Neurology, La Timone Hospital, AP-HM, 13385 Marseille, France.

Sofia Maldonado Slootjes
Department of Neurology, Universitair Ziekenhuis Brussel (UZ Brussel), 1090 Brussels, Belgium.

Sylvain Redon
Evaluation and Treatment of Pain, FHU INOVPAIN, La Timone Hospital, AP-HM, 13385 Marseille, France.

Alexis Robin
Neurology, CHU Grenoble, 38700 Grenoble, France.

Sofiene Hadidane
Cabinets de Neurologie d'Allauch et Plan de Cuques, 13190 Allauch, France.

Vincent Harlay
Neuro-Oncology, AMU, La Timone Hospital, AP-HM, 13005 Marseille, France.

Vito Tota
Neurology, CHU Helora, 7000 Mons, Belgium.

Tanguy Madec
Neurology, Hospital of Noumea, 98800 Nouméa, France.

Alexandre Niset
Sciense, New York, NY 10013, USA.

Mejdeddine Al Barajraji
Department of Neurosurgery, University Hospital of Lausanne and University of Lausanne, 1005 Lausanne, Switzerland.

Joseph R Madsen
Neurodynamics Laboratory, Department of Neurosurgery, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA.

Salim El Hadwe
Neurosurgery, Université Libre de Bruxelles, 1070 Brussels, Belgium.

Nicolas Massager
Department of Neurosurgery, CHU Tivoli, 7110 La Louvière, Belgium.

Stanislas Lagarde
AMU, INSERM, Institut Neuroscience des Systèmes (INS), 13005 Marseille, France.

Romain Carron
Sciense, New York, NY 10013, USA.

External Resources

PubMed ID (PMID): 40309809
