Protein-protein interaction and site prediction using transfer learning.

Journal: Briefings in Bioinformatics
Published Date:

Abstract

Advanced language models have made it possible to recognize protein-protein interactions (PPIs) and interaction sites from protein sequences or structures. Here, we trained the MindSpore ProteinBERT (MP-BERT) model, a Bidirectional Encoder Representations from Transformers (BERT) model, using protein pairs as inputs, making it suitable for identifying PPIs and their respective interaction sites. The pretrained MP-BERT model was fine-tuned as MPB-PPI (MP-BERT on PPI) and outperformed state-of-the-art models on diverse benchmark datasets for PPI prediction. Moreover, the model's ability to recognize PPIs was evaluated across multiple organisms. An amalgamated organism model was designed, which generalized well across the majority of organisms and attained an accuracy of 92.65%. The model was also adapted to predict interaction site propensity by fine-tuning it on PPI site data, yielding MPB-PPISP. Our method facilitates the prediction of both PPIs and their interaction sites, illustrating the potency of transfer learning for protein pair tasks.

Authors

  • Tuoyu Liu
    State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, 100071, China.
  • Han Gao
    Zhejiang Construction Investment Environment Engineering Co, Ltd., Hangzhou, 310013, PR China.
  • Xiaopu Ren
    Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
  • Guoshun Xu
    Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
  • Bo Liu
    Wuhan United Imaging Healthcare Surgical Technology Co., Ltd., Wuhan, China.
  • Ningfeng Wu
    Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
  • Huiying Luo
    Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
  • Yuan Wang
    State Key Laboratory of Soil and Sustainable Agriculture, Changshu National Agro-Ecosystem Observation and Research Station, Institute of Soil Science, Chinese Academy of Sciences, Nanjing, China.
  • Tao Tu
    Google Research, Mountain View, CA, USA.
  • Bin Yao
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Texas, USA.
  • Feifei Guan
    Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
  • Yue Teng
    Haidian Maternal & Child Health Hospital Nutrition Clinic, Beijing 100080, China.
  • Huoqing Huang
    Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
  • Jian Tian