StackHCV: a web-based integrative machine-learning framework for large-scale identification of hepatitis C virus NS5B inhibitors.

Journal: Journal of computer-aided molecular design
PMID:

Abstract

Fast and accurate identification of inhibitors with potency against HCV NS5B polymerase is currently a challenging task. As conventional experimental methods is the gold standard method for the design and development of new HCV inhibitors, they often require costly investment of time and resources. In this study, we develop a novel machine learning-based meta-predictor (termed StackHCV) for accurate and large-scale identification of HCV inhibitors. Unlike the existing method, which is based on single-feature-based approach, we first constructed a pool of various baseline models by employing a wide range of heterogeneous molecular fingerprints with five popular machine learning algorithms (k-nearest neighbor, multi-layer perceptron, partial least squares, random forest and support vectors machine). Secondly, we integrated these baseline models in order to develop the final meta-based model by means of the stacking strategy. Extensive benchmarking experiments showed that StackHCV achieved a more accurate and stable performance as compared to its constituent baseline models on the training dataset and also outperformed the existing predictor on the independent test dataset. To facilitate the high-throughput identification of HCV inhibitors, we built a web server that can be freely accessed at http://camt.pythonanywhere.com/StackHCV . It is expected that StackHCV could be a useful tool for fast and precise identification of potential drugs against HCV NS5B particularly for liver cancer therapy and other clinical applications.

Authors

  • Aijaz Ahmad Malik
    Center of Data Mining & Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
  • Warot Chotpatiwetchkul
    Applied Computational Chemistry Research Unit, Department of Chemistry, School of Science, King Mongkut's Institute of Technology Ladkrabang, Bangkok, 10520, Thailand.
  • Chuleeporn Phanus-Umporn
    Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
  • Chanin Nantasenamat
    Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand; Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
  • Phasit Charoenkwan
  • Watshara Shoombuatong
    Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.