Generative AI chatbots for reliable cancer information: Evaluating web-search, multilingual, and reference capabilities of emerging large language models.

Journal: European journal of cancer (Oxford, England : 1990)
PMID:

Abstract

Recent advancements in large language models (LLMs) enable real-time web search, improved referencing, and multilingual support, yet ensuring they provide safe health information remains crucial. This perspective evaluates seven publicly accessible LLMs (ChatGPT, Co-Pilot, Gemini, MetaAI, Claude, Grok, and Perplexity) on three simple cancer-related queries across eight languages (English, French, Chinese, Thai, Hindi, Nepali, Vietnamese, and Arabic), yielding 336 responses. None of the 42 English responses contained clinically meaningful hallucinations, whereas 7 of 294 non-English responses did. Valid references were included in 48% (162/336) of responses, but 39% of the English references were .com links, reflecting quality concerns. English responses frequently exceeded an eighth-grade reading level, and many non-English outputs were also complex. These findings reflect substantial progress over the past 2 years but reveal persistent gaps in multilingual accuracy, reliable reference inclusion, referral practices, and readability. Ongoing benchmarking is essential to ensure LLMs safely support global health information dichotomy and meet online information standards.

Authors

  • Bradley D Menz
    College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Adelaide, Australia.
  • Natansh D Modi
    College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Adelaide, Australia.
  • Ahmad Y Abuhelwa
    Department of Pharmacy Practice and Pharmacotherapeutics, College of Pharmacy, University of Sharjah, Sharjah, United Arab Emirates.
  • Warit Ruanglertboon
    Division of Health and Applied Sciences, Prince of Songkla University, Songkhla, Thailand; Research Center in Mathematics and Statistics with Applications, Discipline of Statistics, Division of Computational Science, Faculty of Science, Prince of Songkla University, Songkhla, Thailand.
  • Agnes Vitry
    University of South Australia, Clinical and Health Sciences, Adelaide, Australia.
  • Yuan Gao
    Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang Province, China.
  • Lee X Li
    College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Adelaide, Australia.
  • Rakchha Chhetri
    College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Adelaide, Australia.
  • Bianca Chu
    College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Adelaide, Australia.
  • Stephen Bacchi
    Faculty of Health and Medical Sciences, Adelaide Medical School, University of Adelaide, Adelaide, SA 5000, Australia.
  • Ganessan Kichenadasse
    College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Adelaide, Australia; Flinders Centre for Innovation in Cancer, Department of Medical Oncology, Flinders Medical Centre, Flinders University, Bedford Park, South Australia, Australia.
  • Adel Shahnam
    Department of Medical Oncology, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia.
  • Andrew Rowland
    College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Adelaide, Australia.
  • Michael J Sorich
    Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia.
  • Ashley M Hopkins
    Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia.