Deep-ProBind: binding protein prediction with transformer-based deep learning model.

Journal: BMC Bioinformatics
PMID:

Abstract

Binding proteins play a crucial role in biological systems by selectively interacting with specific molecules, such as DNA, RNA, or peptides, to regulate cellular processes. Their ability to recognize and bind target molecules with high specificity makes them essential for signal transduction, transport, and enzymatic activity. Traditional experimental methods for identifying protein-binding peptides are costly and time-consuming. Current sequence-based approaches often struggle with accuracy because they focus too narrowly on proximal sequence features and ignore structural data. This study presents Deep-ProBind, a prediction model designed to classify protein binding sites by integrating sequence and structural information. To encode peptides, the proposed model combines a transformer-based attention model, Bidirectional Encoder Representations from Transformers (BERT), with an evolutionary descriptor, the Pseudo Position-Specific Scoring Matrix with Discrete Wavelet Transform (PsePSSM-DWT). The SHapley Additive exPlanations (SHAP) algorithm selects the optimal hybrid features, and a Deep Neural Network (DNN) then serves as the classifier that predicts protein-binding peptides. The performance of the proposed model was compared with that of traditional Machine Learning (ML) algorithms and existing models. Deep-ProBind achieved 92.67% accuracy with tenfold cross-validation on benchmark datasets and 93.62% accuracy on independent samples, outperforming existing models by 3.57% on training data and 1.52% on independent tests. These results demonstrate Deep-ProBind's reliability and effectiveness, making it a valuable tool for researchers and a potential resource in pharmacological studies, where peptide binding plays a critical role in therapeutic development.
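The DWT step of the PsePSSM-DWT encoding can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the two decomposition levels, the four summary statistics (max, min, mean, standard deviation), the function names, and the toy matrices are all assumptions made for illustration. A real PSSM has 20 columns and is derived from a sequence alignment, and the pseudo (lag-correlation) components of PsePSSM are omitted here. The sketch shows only how wavelet sub-bands turn position-specific score profiles of different lengths into a fixed-length feature vector.

```python
import math
from statistics import mean, pstdev


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    signal = list(signal)
    # Assumption of this sketch: pad odd-length signals by repeating the last value.
    if len(signal) % 2:
        signal.append(signal[-1])
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def summarize(coeffs):
    """Four summary statistics per sub-band (an illustrative choice)."""
    return [max(coeffs), min(coeffs), mean(coeffs), pstdev(coeffs)]


def dwt_features(pssm, levels=2):
    """Compress a PSSM-like matrix (one row per residue) into a fixed-length vector.

    Each column is treated as a signal along the sequence and decomposed
    `levels` times; every detail band and the final approximation band
    contribute four statistics.
    """
    features = []
    for column in zip(*pssm):
        approx = list(column)
        for _ in range(levels):
            approx, detail = haar_dwt(approx)
            features.extend(summarize(detail))
        features.extend(summarize(approx))
    return features


# Toy score matrices with 3 columns (hypothetical values, not real PSSM output).
short_peptide = [[1, 0, -2], [3, -1, 0], [0, 2, 1], [-1, 1, 4], [2, 0, 0]]
long_peptide = short_peptide + [[1, 1, 1], [0, -3, 2], [4, 0, -1]]

short_vec = dwt_features(short_peptide)
long_vec = dwt_features(long_peptide)

# 3 columns x (2 detail bands + 1 approximation band) x 4 statistics = 36 values each.
print(len(short_vec), len(long_vec))
```

Because the vector length depends only on the number of columns and decomposition levels, peptides of different lengths map to vectors of the same size, which is what lets them be concatenated with other descriptors and passed to a downstream classifier.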

Authors

  • Salman Khan
  • Sumaiya Noor
    Business and Management Sciences Department, Purdue University, West Lafayette, IN, USA.
  • Hamid Hussain Awan
    Department of Computer Science, Muslim Youth University, Islamabad, Pakistan.
  • Shehryar Iqbal
    School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, UK.
  • Salman A AlQahtani
    Research Chair of Pervasive and Mobile Computing, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11574, Saudi Arabia.
  • Naqqash Dilshad
    Department of Computer Science & Engineering, Sejong University, Seoul, 05006, South Korea.
  • Nijad Ahmad
    Department of Computer Science, Khurasan University Jalalabad, Jalalabad, Afghanistan. Nijad@khurasan.edu.af.