BertTCR: a Bert-based deep learning framework for predicting cancer-related immune status based on T cell receptor repertoire.

Journal: Briefings in bioinformatics
Published Date:

Abstract

The T cell receptor (TCR) repertoire is pivotal to the human immune system, and a better understanding of it can significantly improve our ability to predict cancer-related immune responses. However, existing methods often overlook intra- and inter-sequence interactions among TCRs, which limits the development of sequence-based prediction of cancer-related immune status. To address this challenge, we propose BertTCR, a deep learning framework designed to predict cancer-related immune status from TCRs. BertTCR combines a pre-trained protein large language model with deep learning architectures, enabling it to extract deeper contextual information from TCRs. Compared with three state-of-the-art sequence-based methods, BertTCR improves the area under the ROC curve (AUC) on an external validation set for thyroid cancer detection by 21 percentage points. The model was trained on more than 2000 publicly available TCR libraries covering 17 types of cancer as well as healthy samples, and its ability to distinguish cancer patients from healthy individuals has been validated on multiple public external datasets. Furthermore, BertTCR can accurately classify various cancer types and healthy individuals. Overall, BertTCR advances TCR-based forecasting of cancer-related immune status and offers promising potential for a wide range of immune status prediction tasks.
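
To make the headline metric concrete, the following is a minimal sketch of how an AUC and a gain in percentage points can be computed for a cancer-versus-healthy classifier. It is not the authors' evaluation code: the function name `roc_auc`, the labels, and both score lists are hypothetical toy values chosen only for illustration, and the resulting numbers have no relation to the results reported in the abstract.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = cancer repertoire, 0 = healthy repertoire.
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.9, 0.4, 0.35, 0.5, 0.3, 0.2]  # assumed baseline model
improved_scores = [0.9, 0.8, 0.45, 0.5, 0.3, 0.2]  # assumed improved model

auc_baseline = roc_auc(labels, baseline_scores)
auc_improved = roc_auc(labels, improved_scores)

# A difference in AUC is reported in percentage points, not as a relative change.
gain_points = (auc_improved - auc_baseline) * 100
print(f"baseline AUC = {auc_baseline:.3f}")
print(f"improved AUC = {auc_improved:.3f}")
print(f"gain = {gain_points:.1f} percentage points")
```

The pairwise definition is quadratic in the number of samples but keeps the sketch dependency-free; for real repertoire cohorts a rank-based implementation from an established library would be the more practical choice.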

Authors

  • Min Zhang
    Department of Infectious Disease, The Second Xiangya Hospital of Central South University, Changsha, China.
  • Qi Cheng
    Institute of Intelligent System and Bioinformatics, College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China.
  • Zhenyu Wei
    Institute of Intelligent System and Bioinformatics, College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China.
  • Jiayu Xu
    College of Computer Science and Technology, Jilin University, 130012 Changchun, China.
  • Shiwei Wu
    Intelligent Systems Science and Engineering College, Harbin Engineering University, Liaoyuan Street, Harbin, 150006, Heilongjiang Province, People's Republic of China.
  • Nan Xu
    Department of Computer Science, University of Southern California, Los Angeles, CA, 90089, USA.
  • Chengkui Zhao
    College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China.
  • Lei Yu
    School of Urban and Environmental Sciences, Central China Normal University, Wuhan 430079, China; Key Laboratory for Geographical Process Analysis & Simulation of Hubei Province, Central China Normal University, Wuhan 430079, China.
  • Weixing Feng
    Institute of Intelligent System and Bioinformatics, College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China.