Generalist large language models complement tailor-made predictors for tumor genomics interpretation

Journal: bioRxiv

Published Date: May 22, 2026

Abstract

General-purpose large language models (LLMs) are trained on large corpora to acquire broad knowledge, but whether they can replace or augment task-specific models is unclear. We evaluated LLMs on three real-world, clinically important tumor genomic interpretation tasks, in order of increasing difficulty: (i) distinguishing tumor from non-tumor mutations (n=34,415 variants), (ii) distinguishing driver from passenger mutations (n=13,469 variants), and (iii) inferring cancer type from tumor sequencing reports across multiple assays and institutions (n=102,791 samples). The best general-purpose LLMs performed as well as the benchmark tailor-made predictor for task (i). Ensembling tailor-made models with zero-shot LLMs improved performance over the tailor-made models alone for tasks (i) and (ii). For task (iii), LLMs outperformed or supplemented tailor-made models on out-of-distribution data. Without fine-tuning, current LLMs can already be useful in clinical genomic interpretation by adding complementary expertise to tailor-made, state-of-the-art predictors.
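The abstract does not say how the tailor-made models and zero-shot LLMs were ensembled. As a minimal sketch of one common approach, assuming each model emits a per-variant probability for the positive class (for example, "driver"), the two scores could be combined by a weighted average. The function name `ensemble_average`, the `llm_weight` parameter, and the example scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def ensemble_average(
    specialist_probs: Sequence[float],
    llm_probs: Sequence[float],
    llm_weight: float = 0.5,
) -> list[float]:
    """Weighted average of two models' per-variant probabilities.

    Assumption: both inputs are probabilities in [0, 1] for the same
    variants, in the same order. This is an illustrative scheme, not
    the authors' method.
    """
    if len(specialist_probs) != len(llm_probs):
        raise ValueError("both models must score the same variants")
    if not 0.0 <= llm_weight <= 1.0:
        raise ValueError("llm_weight must lie in [0, 1]")
    return [
        (1.0 - llm_weight) * p_spec + llm_weight * p_llm
        for p_spec, p_llm in zip(specialist_probs, llm_probs)
    ]


# Made-up scores for three variants (illustrative only, not from the paper).
specialist = [0.9, 0.4, 0.2]
llm = [0.7, 0.8, 0.2]
combined = ensemble_average(specialist, llm)
```

In practice the weight would be chosen on held-out data, and an ensemble like this can only help when the two models make partly different errors, which is the sense of "complementary expertise" in the abstract.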

Authors

Yu, J.; Darmofal, M.; Waters, M.; Choy, J.; Tran, T. N.; Fu, C.; Morales, L.; U, K.; Levine, R. L.; Schultz, N.; Berger, M. F.; Morris, Q.; Jee, J.

