DeepPFP: a multi-task-aware architecture for protein function prediction.

Journal: Briefings in bioinformatics
Published Date:

Abstract

Deriving protein function from protein sequences poses a significant challenge due to the intricate relationship between sequence and function. Deep learning has made remarkable strides in predicting sequence-function relationships. However, models tailored to specific tasks or protein types encounter difficulties when transfer learning is applied across domains, because protein function relies heavily on structural characteristics rather than on sequence information alone. Consequently, there is a pressing need for a model that captures the features shared among diverse sequence-function mapping tasks and thereby addresses the generalization issue. In this study, we explore the potential of Model-Agnostic Meta-Learning combined with a protein language model, Evolutionary Scale Modeling, to tackle this challenge. Our approach involves training the architecture on five out-of-domain deep mutational scanning (DMS) datasets and evaluating its performance across four key dimensions. Our findings demonstrate that the proposed architecture exhibits satisfactory generalization performance and employs an effective few-shot learning strategy. Specifically, compared with the best results, the Pearson's correlation coefficient (PCC) in the final stage increased by ~0.31%. Furthermore, we leverage the trained architecture to predict binding affinity scores for the SARS-CoV-2 DMS dataset using transfer learning. Notably, training on a 500-sample subset of the Ube4b dataset improved the PCC by 0.11. These results underscore the potential of our conceptual architecture as a promising methodology for multi-task protein function prediction.
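To make the meta-learning idea in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a plain linear model in place of the Evolutionary Scale Modeling network, synthetic regression tasks in place of DMS datasets, and a first-order approximation of Model-Agnostic Meta-Learning rather than the full second-order method. All function names and hyperparameters are hypothetical. It shows the two loops (per-task adaptation and shared-initialization update), followed by few-shot adaptation to an unseen task scored with PCC.

```python
import random

def predict(w, X):
    """Linear model: one prediction per row of X."""
    return [sum(wi * xi for wi, xi in zip(w, row)) for row in X]

def mse(w, X, y):
    return sum((p - t) ** 2 for p, t in zip(predict(w, X), y)) / len(y)

def mse_grad(w, X, y):
    """Gradient of the mean squared error with respect to w."""
    errs = [p - t for p, t in zip(predict(w, X), y)]
    return [2.0 * sum(e * row[j] for e, row in zip(errs, X)) / len(y)
            for j in range(len(w))]

def adapt(w, X, y, inner_lr=0.05, steps=1):
    """Inner loop: a few gradient steps on one task's support set."""
    for _ in range(steps):
        g = mse_grad(w, X, y)
        w = [wi - inner_lr * gi for wi, gi in zip(w, g)]
    return w

def fomaml(tasks, dim, inner_lr=0.05, outer_lr=0.05, epochs=200):
    """Outer loop (first-order approximation): move the shared initialization
    along the query-set gradient evaluated at each task's adapted weights."""
    w = [0.0] * dim
    for _ in range(epochs):
        meta_grad = [0.0] * dim
        for Xs, ys, Xq, yq in tasks:
            w_task = adapt(w, Xs, ys, inner_lr)
            g = mse_grad(w_task, Xq, yq)
            meta_grad = [m + gi for m, gi in zip(meta_grad, g)]
        w = [wi - outer_lr * m / len(tasks) for wi, m in zip(w, meta_grad)]
    return w

def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def make_task(rng, dim, n=20):
    """Hypothetical toy task: a random linear map with support and query sets."""
    w_true = [rng.gauss(0, 1) for _ in range(dim)]
    Xs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    Xq = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    return Xs, predict(w_true, Xs), Xq, predict(w_true, Xq)

rng = random.Random(0)
DIM = 4
train_tasks = [make_task(rng, DIM) for _ in range(5)]  # five training tasks
w_meta = fomaml(train_tasks, DIM)

# Few-shot adaptation to an unseen task, then PCC on its query set.
Xs, ys, Xq, yq = make_task(rng, DIM)
w_new = adapt(w_meta, Xs, ys, steps=5)
print("support MSE before/after:", mse(w_meta, Xs, ys), mse(w_new, Xs, ys))
print("query PCC after adaptation:", pearson(predict(w_new, Xq), yq))
```

The first-order variant is used here only because it avoids second derivatives and keeps the sketch dependency-free. In a deep-learning framework the same structure would normally rely on automatic differentiation.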

Authors

  • Han Wang
    Saw Swee Hock School of Public Health, National University Health System, National University of Singapore, Singapore.
  • Zilin Ren
    Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
  • Jinghong Sun
    College of Information Science and Technology, Beijing University of Chemical Technology, No. 15 North Third Ring East Road, Chaoyang District, Beijing 100029, China.
  • Yongbing Chen
    School of Information Science and Technology, Northeast Normal University, Changchun, China.
  • Xiaochen Bo
    Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing 100850, China.
  • JiGuo Xue
    Advanced & Interdisciplinary Biotechnology, Academy of Military Medical Sciences, No. 27 Taiping Road, Haidian District, Beijing 100850, China.
  • Jingyang Gao
    Department of Computer Science and Technology, Beijing University of Chemical Technology, Beijing, China.
  • Ming Ni
    Department of Orthopaedics, Chinese People's Liberation Army General Hospital (301 Hospital), 28 Fuxing Rd, 100853, Beijing, China.