Predicting Immunotherapy Response in Unresectable Hepatocellular Carcinoma: A Comparative Study of Large Language Models and Human Experts.

Journal: Journal of medical systems
Published Date:

Abstract

Hepatocellular carcinoma (HCC) is an aggressive cancer with limited biomarkers for predicting immunotherapy response. Recent advancements in large language models (LLMs) like GPT-4, GPT-4o, and Gemini offer the potential for enhancing clinical decision-making through multimodal data analysis. However, their effectiveness in predicting immunotherapy response, especially compared to human experts, remains unclear. This study assessed the performance of GPT-4, GPT-4o, and Gemini in predicting immunotherapy response in unresectable HCC, compared to radiologists and oncologists of varying expertise. A retrospective analysis of 186 patients with unresectable HCC utilized multimodal data (clinical data and CT images). LLMs were evaluated with zero-shot prompting and two strategies intended to improve sensitivity: the 'voting method' and the 'OR rule method'. Performance metrics included accuracy, sensitivity, area under the curve (AUC), and agreement across LLMs and physicians. GPT-4o, using the 'OR rule method,' achieved 65% accuracy and 47% sensitivity, comparable to intermediate physicians but lower than senior physicians (accuracy: 72%, p = 0.045; sensitivity: 70%, p < 0.0001). Gemini-GPT, combining GPT-4, GPT-4o, and Gemini, achieved an AUC of 0.69, similar to senior physicians (AUC: 0.72, p = 0.35), with 68% accuracy, outperforming junior and intermediate physicians while remaining comparable to senior physicians (p = 0.78). However, its sensitivity (58%) was lower than that of senior physicians (p = 0.0097). Inter-model agreement among LLMs (κ = 0.59-0.70) was higher than inter-physician agreement, which was lowest among junior physicians (κ = 0.15). This study highlights the potential of LLMs, particularly Gemini-GPT, as valuable tools in predicting immunotherapy response for HCC.
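The abstract names a 'voting method' and an 'OR rule method' but does not define them. The following is a minimal Python sketch under stated assumptions: each prediction is binary (1 = responder, 0 = non-responder), voting means a strict majority, and the OR rule flags a patient as a responder if any single prediction does. It also shows how accuracy, sensitivity, and a two-rater Cohen's kappa could be computed; the abstract does not say which kappa variant was used. All function names are hypothetical and no values here come from the study.

```python
from typing import Sequence


def majority_vote(predictions: Sequence[int]) -> int:
    """Assumed 'voting method': 1 if a strict majority of predictions are 1."""
    return int(sum(predictions) * 2 > len(predictions))


def or_rule(predictions: Sequence[int]) -> int:
    """Assumed 'OR rule method': 1 if any prediction is 1 (favors sensitivity)."""
    return int(any(predictions))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases where the prediction matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positives divided by all actual positives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two raters with binary labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 1.0  # both raters always give the same single label
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: three predictions per patient, four patients.
runs = [[1, 0, 1], [0, 0, 1], [0, 0, 0], [1, 1, 1]]
y_true = [1, 1, 0, 0]

voted = [majority_vote(r) for r in runs]
ored = [or_rule(r) for r in runs]
```

On this toy data the OR rule labels the second patient a responder where majority voting does not, which illustrates why an OR-style rule tends to raise sensitivity, usually at some cost in specificity.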

Authors

  • Jun Xu
    Department of Nephrology, The Affiliated Baiyun Hospital of Guizhou Medical University, Guizhou, China.
  • Junjie Wang
    School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
  • Junjun Li
  • Zhangxiang Zhu
    Department of Radiology, The First Affiliated Hospital of Anhui Medical University, Hefei, 230022, P. R. China.
  • Xiao Fu
    State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing 210098, China. fuxiaohhu@163.com.
  • Wei Cai
    Department of Gastrointestinal Surgery, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Ruipeng Song
    Department of Hepatobiliary Surgery, Division of Life Sciences and Medicine, Anhui Province Key Laboratory of Hepatopancreatobiliary Surgery, Anhui Provincial Clinical Research Center for Hepatobiliary Diseases, The First Affiliated Hospital of USTC, The University of Science and Technology of China, Hefei, 230001, P. R. China.
  • Tengfei Wang
    Department of Cardiology, The First Affiliated Hospital of Anhui Medical University, Hefei, China.
  • Hai Li
    School of Economics and Management, Shanghai University of Sport, Shanghai, China.