Deep-ProBind: binding protein prediction with transformer-based deep learning model.

Journal: BMC Bioinformatics

PMID: 40121399

Abstract

Binding proteins play a crucial role in biological systems by selectively interacting with specific molecules, such as DNA, RNA, or peptides, to regulate cellular processes. Their ability to recognize and bind target molecules with high specificity makes them essential for signal transduction, transport, and enzymatic activity. Traditional experimental methods for identifying protein-binding peptides are costly and time-consuming. Current sequence-based approaches often struggle with accuracy because they focus narrowly on proximal sequence features and ignore structural data. This study presents Deep-ProBind, a prediction model designed to classify protein binding sites by integrating sequence and structural information. To encode peptides, the proposed model combines a transformer-based representation, Bidirectional Encoder Representations from Transformers (BERT), with an evolutionary representation, the Pseudo Position-Specific Scoring Matrix with Discrete Wavelet Transform (PsePSSM-DWT). The SHapley Additive exPlanations (SHAP) algorithm selects the optimal hybrid features, and a Deep Neural Network (DNN) then serves as the classifier that predicts protein-binding peptides. The model was evaluated against traditional Machine Learning (ML) algorithms and existing models. Deep-ProBind achieved 92.67% accuracy under tenfold cross-validation on benchmark datasets and 93.62% accuracy on independent samples, outperforming existing models by 3.57% on training data and 1.52% on independent tests. These results demonstrate Deep-ProBind's reliability and effectiveness, making it a valuable tool for researchers and a potential resource in pharmacological studies, where peptide binding plays a critical role in therapeutic development.
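The abstract names PsePSSM-DWT as the evolutionary encoding but does not give its parameters. As an illustration only, the following Python sketch computes a PsePSSM descriptor using the commonly cited formulation (per-column means plus lagged squared differences) and applies one level of a Haar wavelet transform to a profile column. The function names `pse_pssm` and `haar_dwt`, the `max_lag` parameter, the choice of the Haar wavelet, and the toy matrix are all assumptions made for this example; they are not taken from the paper, and how the authors combine the two steps may differ.

```python
import math


def pse_pssm(pssm, max_lag=2):
    """Pseudo-PSSM descriptor for an L x 20 profile given as a list of rows.

    Returns 20 column means followed by 20 lagged squared-difference
    averages for each lag from 1 to max_lag (20 + 20 * max_lag values).
    """
    length = len(pssm)
    n_cols = len(pssm[0])
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")

    # Part 1: average score of each amino-acid column over the sequence.
    features = [sum(row[j] for row in pssm) / length for j in range(n_cols)]

    # Part 2: sequence-order information from residues `lag` positions apart.
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


def haar_dwt(signal):
    """One level of the Haar wavelet transform (assumed wavelet choice).

    Returns (approximation, detail) coefficient lists. Odd-length input
    is padded by repeating the last value.
    """
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


# Hypothetical toy profile: 4 residues x 20 columns (not real PSSM scores).
toy_pssm = [[float(i + j) for j in range(20)] for i in range(4)]

descriptor = pse_pssm(toy_pssm, max_lag=2)

# Treat the first PSSM column as a signal along the sequence and decompose it.
first_column = [row[0] for row in toy_pssm]
approx, detail = haar_dwt(first_column)
```

With `max_lag=2` the descriptor has a fixed length of 60 regardless of peptide length, which is what lets variable-length peptides feed a fixed-input classifier. In a hybrid pipeline of the kind the abstract describes, a vector like this would be concatenated with BERT embeddings before feature selection; that step is not shown here.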

Authors

Salman Khan

Sumaiya Noor
Business and Management Sciences Department, Purdue University, West Lafayette, IN, USA.

Hamid Hussain Awan
Department of Computer Science, Muslim Youth University, Islamabad, Pakistan.

Shehryar Iqbal
School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, UK.

Salman A AlQahtani
Research Chair of Pervasive and Mobile Computing, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11574, Saudi Arabia.

Naqqash Dilshad
Department of Computer Science & Engineering, Sejong University, Seoul, 05006, South Korea.

Nijad Ahmad
Department of Computer Science, Khurasan University Jalalabad, Jalalabad, Afghanistan. Nijad@khurasan.edu.af.

Keywords

Algorithms; Binding Sites; Computational Biology; Databases, Protein; Deep Learning; Neural Networks, Computer; Protein Binding; Proteins; Sequence Analysis, Protein
