StackTHP: A stacking ensemble model for accurate prediction of tumor-homing peptides in cancer therapy.

Journal: Computers in biology and medicine
PMID:

Abstract

Tumor-homing peptides (THPs) have emerged as an attractive resource for targeted cancer therapy because they can selectively bind and penetrate tumor cells while sparing adjacent healthy tissue. Computational models for predicting THPs have therefore gained popularity rapidly, since laboratory methods are slow and resource-intensive. Herein, we propose StackTHP, a stacking-ensemble model aimed at further improving THP prediction accuracy. StackTHP combines multiple feature extraction methods, including amino acid composition (AAC) and pseudo amino acid composition (PAAC), with classical machine learning classifiers such as Extra Trees, Random Forest, and AdaBoost, and uses a logistic regression-based meta-classifier in the stacking framework. On benchmark datasets, StackTHP outperformed all other models, achieving an accuracy of 91.92%, a Matthews correlation coefficient (MCC) of 0.8415, and an AUC of 0.977. These results indicate that it improves on earlier approaches and provides a robust solution for advancing the discovery and development of peptide-based cancer therapies. Future research will focus on applying StackTHP to more diverse datasets and on exploring hybrid methods to enhance its predictive capability. The dataset and code are available at https://github.com/Ashikur562/StackTHP.
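
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that): the names `aac` and `build_stack`, the default hyperparameters, and the 5-fold setting are assumptions made for illustration, and PAAC features are omitted. The sketch computes AAC as the fraction of each of the 20 standard residues in a peptide and, assuming scikit-learn is installed, wires the three named base classifiers into a stacking ensemble with a logistic regression meta-classifier.

```python
from collections import Counter

# The 20 standard amino acids, ordered by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Return the 20-dimensional amino acid composition of a peptide.

    Each entry is the fraction of standard residues in the sequence that
    are the corresponding amino acid. Non-standard symbols are ignored
    (an assumption of this sketch).
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def build_stack(random_state=0):
    """Build a stacking ensemble in the spirit of the abstract.

    Hypothetical helper: base learners are Extra Trees, Random Forest,
    and AdaBoost with default settings; the meta-classifier is logistic
    regression. scikit-learn is imported lazily so that the feature code
    above runs without it.
    """
    from sklearn.ensemble import (
        AdaBoostClassifier,
        ExtraTreesClassifier,
        RandomForestClassifier,
        StackingClassifier,
    )
    from sklearn.linear_model import LogisticRegression

    base_learners = [
        ("extra_trees", ExtraTreesClassifier(random_state=random_state)),
        ("random_forest", RandomForestClassifier(random_state=random_state)),
        ("adaboost", AdaBoostClassifier(random_state=random_state)),
    ]
    # cv=5: the meta-classifier is trained on out-of-fold predictions
    # from the base learners (fold count is an assumption).
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
```

Given a list of peptide strings `peptides` and binary labels `y` (both hypothetical), a feature matrix could be built with `X = [aac(p) for p in peptides]` and the ensemble trained with `build_stack().fit(X, y)`.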

Authors

  • Fazla Rabby Raihan
    Department of Software Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh. Electronic address: fazla35-436@diu.edu.bd.
  • Lway Faisal Abdulrazak
    Department of Space Technology Engineering, Electrical Engineering Technical College, Middle Technical University, Baghdad, Iraq.
  • Md Ashikur Rahman
    Department of Software Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh. Electronic address: ashikur35-562@diu.edu.bd.
  • Md Mamun Ali
    Department of Software Engineering (SWE), Daffodil International University (DIU), Sukrabad, Dhaka, 1207, Bangladesh.
  • Sobhy M Ibrahim
    Department of Biochemistry, College of Science, King Saud University, P.O. Box: 2455, Riyadh, 11451, Saudi Arabia. Electronic address: syakout@ksu.edu.sa.
  • Kawsar Ahmed
    Group of Biophotomatiχ, Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail-1902, Bangladesh; Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh. Electronic address: kawsar.ict@mbstu.ac.bd.
  • Francis M Bui
    Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK S7N 5A9, Canada.
  • Imran Mahmud
    Department of Software Engineering, Daffodil International University, Daffodil Smart City (DSC), Savar, Dhaka, Bangladesh.