PredAPP: Predicting Anti-Parasitic Peptides with Undersampling and Ensemble Approaches.

Journal: Interdisciplinary sciences, computational life sciences
Published Date:

Abstract

Anti-parasitic peptides (APPs) are regarded as promising therapeutic candidates against parasitic diseases. Because experimental techniques for identifying APPs are expensive and time-consuming, there is an urgent need for a computational approach that can predict APPs on a large scale. In this study, we present a computational method, termed PredAPP (Prediction of Anti-Parasitic Peptides), that effectively identifies APPs using an ensemble of well-performing machine learning (ML) classifiers. First, to address the class imbalance problem, a balanced training dataset was generated by undersampling. We found that the balanced dataset based on cluster centroids achieved the best performance. Next, nine groups of features and six ML algorithms were combined to generate 54 classifiers, whose outputs formed 54 feature representations. Within each feature group, we selected the best-performing feature representation for classification. Finally, the selected feature representations were integrated using a logistic regression algorithm to construct the prediction model PredAPP. On the independent dataset, PredAPP achieved an accuracy of 0.880 and an AUC of 0.922, compared to 0.739 and 0.873, respectively, for AMPfun, a state-of-the-art method for predicting APPs. PredAPP is freely accessible as a web server at http://predapp.xialab.info and on GitHub at https://github.com/xialab-ahu/PredAPP .
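
The abstract does not describe how the cluster-centroid undersampling was implemented. As a rough illustration only, the idea can be sketched as follows: the majority class is clustered with k-means, with k set to the number of minority samples, and the majority samples are replaced by the resulting centroids. The function names, the toy two-dimensional data, and the plain Lloyd's k-means below are assumptions made for this sketch and are not taken from PredAPP itself.

```python
import random
from math import dist  # Euclidean distance (Python 3.8+)


def kmeans_centroids(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns k centroids as tuples (illustrative helper)."""
    points = [tuple(p) for p in points]
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # requires k <= len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def cluster_centroid_undersample(X, y, majority_label, seed=0):
    """Replace the majority class by as many k-means centroids as there are
    minority samples, giving a balanced dataset (hypothetical helper)."""
    minority = [tuple(x) for x, lab in zip(X, y) if lab != majority_label]
    minority_labels = [lab for lab in y if lab != majority_label]
    majority = [tuple(x) for x, lab in zip(X, y) if lab == majority_label]
    centroids = kmeans_centroids(majority, k=len(minority), seed=seed)
    X_bal = minority + centroids
    y_bal = minority_labels + [majority_label] * len(centroids)
    return X_bal, y_bal


# Toy imbalanced data: 8 "positive" and 40 "negative" points in 2-D.
# Real peptide feature vectors would take the place of these points.
rng = random.Random(1)
X_pos = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(8)]
X_neg = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
X = X_pos + X_neg
y = [1] * 8 + [0] * 40

X_bal, y_bal = cluster_centroid_undersample(X, y, majority_label=0)
print(len(X_bal), y_bal.count(1), y_bal.count(0))
```

Note that the centroids are synthetic points rather than actual majority samples, which is the usual trade-off of this kind of undersampling. The sketch assumes the minority class is no larger than the majority class.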

Authors

  • Wei Zhang
    The First Affiliated Hospital of Nanchang University, Nanchang, China.
  • Enhua Xia
    State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, Anhui, China.
  • Ruyu Dai
    Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China.
  • Wending Tang
    Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China.
  • Yannan Bin
    Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China.
  • Junfeng Xia