pACP-HybDeep: predicting anticancer peptides using binary tree growth based transformer and structural feature encoding with deep-hybrid learning.

Journal: Scientific Reports
PMID:

Abstract

Worldwide, cancer remains a significant health concern due to its high mortality rate. Numerous traditional therapies and wet-laboratory methods exist for treating cancer-affected cells, but these approaches often face limitations, including high costs and substantial side effects. Recently, peptides have garnered significant attention from scientists because of their high selectivity, reliable targeted action, and minimal adverse effects. Building on the promising results of existing computational models, we propose a highly reliable and effective model, pACP-HybDeep, for the accurate prediction of anticancer peptides. In this model, training peptides are numerically encoded using an attention-based ProtBERT-BFD encoder to extract semantic features, together with CTDT-based structural information. A k-nearest neighbor-based binary tree growth (BTG) algorithm is then employed to select an optimal feature set from the resulting multi-perspective vector. The selected feature vector is subsequently used to train a CNN + RNN-based deep learning model. The proposed pACP-HybDeep model achieved a training accuracy of 95.33% and an AUC of 0.97. To assess its generalization capability, the model was evaluated on the independent datasets Ind-S1, Ind-S2, and Ind-S3, where it achieved accuracies of 94.92%, 92.26%, and 91.16%, respectively. The efficacy and reliability demonstrated on these test datasets establish pACP-HybDeep as a valuable tool for researchers in academia and pharmaceutical drug design.
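
As an illustration of the CTDT-based structural encoding mentioned above, the following is a minimal sketch of a transition descriptor from the Composition-Transition-Distribution family. It assumes a single three-class hydrophobicity grouping of amino acids; the grouping, the function name `ctdt`, and the feature labels are illustrative assumptions and are not taken from the paper, which may use different or additional physicochemical properties.

```python
from itertools import combinations

# Assumed three-class hydrophobicity grouping (illustrative only; the paper's
# exact property set and class boundaries are not given in the abstract).
GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}


def ctdt(sequence, groups=GROUPS):
    """Return transition frequencies between each pair of residue classes.

    For every pair of distinct classes (a, b), the feature is the number of
    adjacent residue pairs that switch from a to b or from b to a, divided
    by the number of adjacent pairs in the sequence (length - 1).
    """
    # Map each residue letter to its class name.
    residue_class = {aa: name for name, members in groups.items() for aa in members}
    # Unknown residues map to None and therefore never match a class pair.
    classes = [residue_class.get(aa) for aa in sequence.upper()]
    pairs = list(zip(classes, classes[1:]))
    total = max(len(classes) - 1, 0)

    features = {}
    for a, b in combinations(groups, 2):
        count = sum(1 for p in pairs if p in ((a, b), (b, a)))
        features[f"{a}-{b}"] = count / total if total else 0.0
    return features


if __name__ == "__main__":
    # Hypothetical short peptide used only to show the call.
    print(ctdt("RGCR"))
```

With one property this yields three values per peptide; in practice such descriptors are usually computed for several properties and concatenated with other feature vectors before feature selection.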

Authors

  • Shahid
    Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, 23200, KP, Pakistan.
  • Maqsood Hayat
    Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP, Pakistan. Electronic address: m.hayat@awkum.edu.pk.
  • Wajdi Alghamdi
    Data Science & Soft Computing Lab, and Department of Computing, Goldsmiths, University of London, UK.
  • Shahid Akbar
    Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan.
  • Ali Raza
    Department of Medical Microbiology and Clinical Microbiology, Near East University, Cyprus.
  • Rabiah Abdul Kadir
    Institute of Visual Informatics, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia. rabiahivi@ukm.edu.my.
  • Mahidur R Sarker
    Institute of Visual Informatics, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia.