Artificial intelligence in neurovascular decision-making: a comparative analysis of ChatGPT-4 and multidisciplinary expert recommendations for unruptured intracranial aneurysms.

Journal: Neurosurgical Review

PMID: 39982556

Abstract

In the multidisciplinary treatment of cerebrovascular diseases, specialists from different disciplines strive to develop patient-specific treatment recommendations. ChatGPT is a natural language processing chatbot with increasing applicability in medical practice. This study evaluates ChatGPT's ability to provide treatment recommendations for patients with unruptured intracranial aneurysms (UIA). Anonymized patient data and radiological reports of 20 patients with UIAs were provided to GPT-4 in a standardized format and used to generate a treatment recommendation for different clinical scenarios. GPT-4 responses were rated on a Likert scale by a multidisciplinary panel of specialists and subsequently benchmarked against the Unruptured Intracranial Aneurysm Treatment Score (UIATS) as well as the actual treatment decision made by the multidisciplinary institutional neurovascular board (INVB). Agreement between expert raters was measured using a linearly weighted Fleiss' kappa coefficient. GPT-4 analyzed individual pathological features of the radiological reports and formulated a corresponding assessment for each aspect. None of the generated recommendations showed evidence of factual hallucination, although in 25% of the case studies no specific recommendation could be derived from the GPT-4 responses. The expert panel rated the overall quality of the GPT-4 recommendations with a median of 3.4 out of 5 points. The GPT-4 recommendations were congruent with those of the INVB in 65% of cases. Interrater reliability among experts showed moderate to low agreement in the assessment of AI-assisted decision-making. GPT-4 appears to be able to process clinical information about UIAs and generate treatment recommendations. However, the recommendations remain too ambiguous, and their use of scientific evidence is not yet sufficiently patient- and case-specific, to substitute for the decision-making of a multidisciplinary neurovascular board. A prospective evaluation of GPT-4 competence as a companion in decision-making panels is deemed necessary.

Authors

Alexis Hadjiathanasiou
Department of Neurosurgery, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany. alexis.hadjiathanasiou@ukb.de.

Leonie Goelz
Department of Radiology and Neuroradiology, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Florian Muhn
Department of Neurology, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Rebecca Heinz
Department of Neurosurgery, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Lutz Kreissl
Department of Radiology and Neuroradiology, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Paul Sparenberg
Department of Neurology, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Johannes Lemcke
Department of Neurosurgery, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Ingo Schmehl
Department of Neurology, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Sven Mutze
Department of Radiology and Neuroradiology, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Patrick Schuss
Department of Neurosurgery, BG Klinikum Unfallkrankenhaus Berlin, Berlin, Germany.

Keywords

Adult; Aged; Artificial Intelligence; Clinical Decision-Making; Decision Making; Female; Generative Artificial Intelligence; Humans; Intracranial Aneurysm; Male; Middle Aged; Natural Language Processing
