Machine learning outperforms large language models for survival prediction in advanced hepatocellular carcinoma: a multicenter study.
Journal:
Scientific Reports
Published Date:
Apr 30, 2026
Abstract
Accurate prognostic prediction remains a critical unmet need in advanced hepatocellular carcinoma (HCC). While machine learning (ML) models have demonstrated value in outcome prediction, the ability of large language models (LLMs) to perform structured clinical prognostic tasks remains unclear. In this multicenter retrospective study, 1031 patients with HCC receiving interventional therapy combined with targeted treatment were analyzed and randomly divided into training (n = 717) and test (n = 314) cohorts. Six ML algorithms were developed and compared with two LLMs (ChatGPT-4o and DeepSeek-v3). Seven variables, including age, comorbidities, albumin-bilirubin grade, tumor burden, portal vein tumor thrombus, and alpha-fetoprotein level, were identified as key predictors. In the test cohort, SVM achieved the highest performance (AUC = 0.658), followed by XGBoost (AUC = 0.654), whereas LLMs showed limited discriminative ability (AUC = 0.590-0.591; P < 0.05 vs ML). ML models effectively stratified patients into risk groups with significantly different 1-year survival, while LLM-based predictions failed to distinguish outcomes. These findings indicate that ML models outperform current LLMs in structured prognostic prediction and provide more reliable support for risk stratification and clinical decision-making in advanced HCC.
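As a rough illustration of the comparison metric reported above, the following minimal sketch computes the area under the ROC curve (AUC) for two sets of risk scores and splits patients into risk groups at the median score. It is not the study's code. The function names (`auc`, `split_by_median`), the median cutoff, and all toy values are assumptions made for illustration only, and the abstract does not state how the risk groups were actually defined.

```python
from statistics import median


def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def split_by_median(scores):
    """Assign each score to a 'high' or 'low' risk group.
    The median cutoff is an illustrative assumption, not the study's rule."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical toy data: 1 = died within 1 year, 0 = survived.
died_within_1y = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical risk scores from two models (not values from the study).
model_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.7, 0.1]
model_b = [0.6, 0.4, 0.5, 0.5, 0.3, 0.6, 0.5, 0.4]

auc_a = auc(died_within_1y, model_a)
auc_b = auc(died_within_1y, model_b)
groups_a = split_by_median(model_a)

print(f"model A AUC = {auc_a:.3f}")
print(f"model B AUC = {auc_b:.3f}")
print("model A risk groups:", groups_a)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is the scale on which the reported values of 0.658 (SVM) and 0.590-0.591 (LLMs) should be read.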