Identification of Multi-functional Therapeutic Peptides Based on Prototypical Supervised Contrastive Learning.

Journal: Interdisciplinary sciences, computational life sciences
Published Date:

Abstract

High-throughput sequencing has rapidly increased the number of available peptide sequences, creating a need for computational methods that identify multi-functional therapeutic peptides (MFTP) directly from sequence. However, existing computational methods are challenged by class imbalance, particularly when learning effective sequence representations. To address this, we propose PSCFA, a prototypical supervised contrastive learning method with feature augmentation for MFTP prediction. We employ a two-stage training scheme that trains the feature extractor and the classifier separately, on the principle that better feature representations improve classification accuracy. In the first stage, we use a prototypical supervised contrastive learning strategy to enhance the uniformity of the feature space distribution, so that the features of samples within the same category are tightly clustered while those from different categories are more dispersed. In the second stage, a feature augmentation strategy that focuses on infrequent labels (tail labels) refines the training of the classifier. We use a prototype-based variational autoencoder to capture the semantic links between common labels (head labels) and their prototypes. This knowledge is then transferred to tail labels, generating augmented features for classifier training. Experiments show that PSCFA significantly outperforms existing methods for MFTP prediction, marking an advance in therapeutic peptide identification.
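As a rough illustration of the first-stage idea only, the sketch below computes a prototype-based contrastive loss that pulls each embedding toward its own class prototype and away from the others. It is not the PSCFA implementation: it assumes one label per sample (MFTP prediction is multi-label), prototypes defined as class means, and a temperature of 0.1, and every function name here is hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def class_prototypes(embeddings, labels):
    """Assumed prototype definition: normalized mean of each class's unit embeddings."""
    sums = {}
    for z, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(z))
        for i, x in enumerate(normalize(z)):
            acc[i] += x
    return {y: normalize(acc) for y, acc in sums.items()}


def prototypical_contrastive_loss(embeddings, labels, prototypes, temperature=0.1):
    """Mean cross-entropy of each sample against all prototypes.

    The logit for class c is cosine(z, prototype_c) / temperature, so the loss
    is small when a sample sits close to its own prototype and far from others.
    """
    total = 0.0
    for z, y in zip(embeddings, labels):
        z = normalize(z)
        logits = {
            c: sum(a * b for a, b in zip(z, p)) / temperature
            for c, p in prototypes.items()
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += log_denom - logits[y]
    return total / len(embeddings)


if __name__ == "__main__":
    # Toy 2-D embeddings: two well-separated groups.
    emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels = ["a", "a", "b", "b"]
    protos = class_prototypes(emb, labels)
    print(prototypical_contrastive_loss(emb, labels, protos))
```

With labels that match the geometric clusters the loss is close to zero; shuffling the labels so that each class mixes both clusters makes the prototypes coincide and raises the loss, which is the behavior a feature extractor trained on such an objective would be pushed to avoid.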

Authors

  • Sitong Niu
    College of Mathematics and System Sciences, Xinjiang University, Urumqi, 830046, Xinjiang, China.
  • Henghui Fan
    Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China.
  • Fei Wang
    Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, NY, United States.
  • Xiaomei Yang
    Department of Gynaecology, Huangdao District Chinese Medicine Hospital, Qingdao 266500, China.
  • Junfeng Xia