Topology-Enhanced Machine Learning Model (Top-ML) for Anticancer Peptide Prediction.

Journal: Journal of chemical information and modeling
PMID:

Abstract

Recently, therapeutic peptides have demonstrated great promise for cancer treatment. To explore powerful anticancer peptides, artificial intelligence (AI)-based approaches have been developed to systematically screen potential candidates. However, the lack of efficient featurization of peptides has become a bottleneck for these machine-learning models. In this paper, we propose a topology-enhanced machine learning model (Top-ML) for anticancer peptide prediction. Our Top-ML employs peptide topological features derived from its sequence "connection" information characterized by spectral descriptors. Our Top-ML model, employing an Extra-Trees classifier, has been validated on the AntiCP 2.0 and mACPpred 2.0 benchmark data sets, achieving state-of-the-art performance or results comparable to existing deep learning models, while providing greater interpretability. Our results highlight the potential of leveraging novel topology-based featurization to accelerate the identification of anticancer peptides.

Authors

  • Joshua Zhi En Tan
    Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore.
  • JunJie Wee
    Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore.
  • Xue Gong
    Department of Biology and Medicine, China University of Mining and Technology of School of Chemical Engineering & Technology, Xuzhou, Jiangsu, China (mainland).
  • Kelin Xia
    Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore.