ProtTrans and multi-window scanning convolutional neural networks for the prediction of protein-peptide interaction sites.

Journal: Journal of molecular graphics & modelling
Published Date:

Abstract

This study addresses the prediction of protein-peptide interaction sites using machine learning, comparing sequence-based models, standard CNNs, and traditional classifiers. Leveraging pre-trained language models and multi-view window scanning CNNs, our approach yields significant improvements. Among the language models, ProtTrans stands out; it was pre-trained on 2.1 billion protein sequences comprising 393 billion amino acids. The integrated model achieves an AUC of 0.856 and 0.823 on the PepBCL Set_1 and Set_2 datasets, respectively, and a Precision of 0.564 on Set_1 and 0.527 on Set_2, surpassing previous methods. We also explore applications of the model in cancer therapy, particularly identifying peptide interactions for selective targeting of cancer cells, as well as in other fields. These findings contribute to bioinformatics and provide insights for drug discovery and therapeutic development.
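For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how AUC and Precision can be computed for per-residue binding-site predictions. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) residue pairs
    in which the positive residue gets the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision(labels, scores, threshold=0.5):
    """Fraction of residues predicted as binding (score >= threshold)
    that are truly binding. The 0.5 threshold is an assumption."""
    predicted = [y for y, s in zip(labels, scores) if s >= threshold]
    if not predicted:
        return 0.0
    return sum(predicted) / len(predicted)


# Hypothetical toy data: 1 = peptide-binding residue, 0 = non-binding.
labels = [0, 0, 1, 1, 0, 1, 0, 0]
scores = [0.1, 0.4, 0.8, 0.35, 0.2, 0.9, 0.6, 0.05]

if __name__ == "__main__":
    print("AUC:", roc_auc(labels, scores))
    print("Precision:", precision(labels, scores))
```

AUC does not depend on a threshold, whereas Precision does, which is one reason a model can show a high AUC alongside a more modest Precision when binding residues are a small minority of all residues.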

Authors

  • Van-The Le
    Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan.
  • Zi-Jun Zhan
    Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan.
  • Thi-Thu-Phuong Vu
    Graduate Program in Biomedical Informatics, Yuan Ze University, Chung-Li, 32003, Taiwan.
  • Muhammad-Shahid Malik
    Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan; Department of Computer Science and Engineering, Karakoram International University, Pakistan.
  • Yu-Yen Ou
    Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan. Electronic address: yien@saturn.yzu.edu.tw.