Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support.

Journal: Communications medicine

Published Date: Aug 2, 2025

Abstract

BACKGROUND: Large language models (LLMs) show promise in clinical contexts but can generate false facts (often referred to as "hallucinations"). One subset of these errors arises from adversarial attacks, in which fabricated details embedded in prompts lead the model to produce or elaborate on the false information. We embedded fabricated content in clinical prompts to elicit adversarial hallucination attacks in multiple large language models. We quantified how often they elaborated on false details and tested whether a specialized mitigation prompt or altered temperature settings reduced errors.

Authors

Mahmud Omar

Tel-aviv university, Faculty of medicine, Tel-Aviv, Israel. Electronic address: Mahmudomar70@gmail.com.
Vera Sorin

Department of Diagnostic Imaging, Chaim Sheba Medical Center, Tel Hashomer, Israel.
Jeremy D Collins

Department of Radiology, Mayo Clinic, 200 1stSt SW, Rochester, MN, 55902, USA.
David Reich

Anesthesiology, Perioperative and Pain Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA.
Robert Freeman

Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, 1 Gustave L Levy Pl, New York, NY 10029, USA.
Nicholas Gavin

Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Alexander Charney

The Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Medical Center, New York, NY, USA.
Lisa Stump

Institute for Healthcare Delivery Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Nicola Luigi Bragazzi

Laboratory for Industrial and Applied Mathematics, Department of Mathematics and Statistics, York University, Toronto, ON, Canada.
Girish N Nadkarni

Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Eyal Klang

Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Keywords

No keywords available for this article.

External Resources

View on PubMed Access via DOI PubMed (40753316)

Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support.

Abstract

Authors

Keywords

External Resources

Popular Topics

Recent Journals

Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support.

Abstract

Authors

Keywords

External Resources

Stay Ahead of Medical AI

Popular Topics

Recent Journals