Evaluating ChatGPT's recommendations for systematic treatment decisions in recurrent or metastatic head and neck squamous cell carcinoma: Perspectives from experts and junior doctors.

Journal: International journal of cancer
Published Date:

Abstract

This study evaluates ChatGPT-4's potential as a decision-support tool in the treatment of recurrent or metastatic head and neck squamous cell carcinoma (HNSCC). The study included 12 retrospectively selected patients with detailed data on clinical characteristics, tumor status, treatment history, imaging, pathology, and symptoms. ChatGPT-4, along with six experts and 10 junior oncologists, assessed these cases. The AI model applied the 8th edition AJCC TNM criteria for tumor staging and proposed treatment strategies. Its performance was rated on a 0-100 scale by both expert and junior oncologists, with further analysis through statistical scoring and intraclass correlation coefficients. ChatGPT-4 achieved an 83.3% accuracy rate in tumor staging, mis-staging two of the 12 cases. Junior doctors rated its staging performance highly, showing strong consensus on its language capabilities and moderate consensus on its value as a learning aid. In rating ChatGPT-4's treatment strategies, experts showed high agreement on subject knowledge (median 86, mean 84.7), logical reasoning (median 83, mean 82), and analytical skills (median 85, mean 82), and moderate agreement on its usefulness for treatment decisions (median 80, mean 77) and on its recommendations (median 80, mean 76.8). Junior doctors rated ChatGPT-4's treatment strategies higher (medians of 85 or above) but with limited consensus (subject knowledge: median 88, mean 84.5; logical reasoning: median 90, mean 83.2; analytical skills: median 90, mean 82.5; usefulness: median 85, mean 81.8; agreement with its recommendations: median 85, mean 80.4). ChatGPT-4 is proficient in tumor staging but moderately effective in treatment recommendations. Nonetheless, it shows promise as a supportive tool for clinicians, particularly those with less experience, in making informed treatment decisions.

Authors

  • Danfang Yan
    Department of Radiation Oncology, the First Affiliated Hospital, College of Medicine, Zhejiang University, Zhejiang, Hangzhou, China.
  • Lihong Wang
  • Liming Huang
    Department of Oncology, The Affiliated People's Hospital, Fujian University of Traditional Chinese Medicine, Fuzhou, Fujian, China.
  • Kejia Cheng
    Department of Otolaryngology, the First Affiliated Hospital, College of Medicine, Zhejiang University, Zhejiang, Hangzhou, China.
  • Yu Huang
    School of Data Science and Software Engineering, Qingdao University, Qingdao 266021, China.
  • Yangyang Bao
    Department of Otolaryngology, the First Affiliated Hospital, College of Medicine, Zhejiang University, Zhejiang, Hangzhou, China.
  • Xin Yin
    School of Software & Microelectronics, Peking University, Beijing, 102600 China.
  • Mengye He
    Department of Oncology, the First Affiliated Hospital, College of Medicine, Zhejiang University, Zhejiang, Hangzhou, China.
  • Huiyong Zhu
    Department of Oral and Maxillofacial Surgery, The First Affiliated Hospital, Zhejiang University School of Medicine, 79# Qingchun Road, Hangzhou, 310003, People's Republic of China. zhuhuiyong@zju.edu.cn.
  • SenXiang Yan
    Department of Radiation Oncology, The First Affiliated Hospital, Zhejiang University School of Medicine, No. 79 Qingchun Road, Hangzhou, 310003, Zhejiang, China; Cancer Center, Zhejiang University, 866 Yuhangtang Road, Hangzhou, 310027, Zhejiang, China. Electronic address: yansenxiang@zju.edu.cn.
