NPClassifier: A Deep Neural Network-Based Structural Classification Tool for Natural Products.

Journal: Journal of Natural Products
Published Date:

Abstract

Computational approaches such as genome and metabolome mining are becoming essential to natural products (NPs) research. Consequently, a need exists for an automated structure-type classification system to handle the massive amounts of data appearing for NP structures. An ideal semantic ontology for the classification of NPs should go beyond the simple presence/absence of chemical substructures and also include the taxonomy of the producing organism, the nature of the biosynthetic pathway, and/or the biological properties of the compounds. Thus, a holistic and automatic NP classification framework could have considerable value for comprehensively navigating the relatedness of NPs, especially when analyzing large numbers of them. Here, we introduce NPClassifier, a deep-learning tool for the automated structural classification of NPs from their counted Morgan fingerprints. NPClassifier is expected to accelerate and enhance NP discovery by linking NP structures to their underlying properties.
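To make the term "counted Morgan fingerprint" concrete, the following is a minimal, simplified sketch of the idea: each atom gets an identifier, the identifiers are repeatedly updated from those of neighboring atoms, and the number of occurrences of each identifier is stored in a fixed-length vector (rather than a 0/1 bit). This is not the authors' implementation and not the algorithm of any particular toolkit; the function names, the hand-written atom/bond input format, the choice of atom invariants, and the default `radius` and `n_bits` values are all assumptions made for illustration. In practice such fingerprints would normally be computed from real structures with a cheminformatics toolkit.

```python
from collections import Counter
import hashlib


def stable_hash(obj) -> int:
    """Deterministic 64-bit hash of a Python object's repr (illustrative only)."""
    digest = hashlib.sha256(repr(obj).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def counted_morgan_sketch(atoms, bonds, radius=2, n_bits=2048):
    """Simplified count-based circular fingerprint (hypothetical helper).

    atoms: list of element symbols for the heavy atoms, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs
    Returns a list of length n_bits holding environment counts.

    Simplifications (assumptions): bond orders, charges, and rings are
    ignored, and duplicate environments are not pruned as a full Morgan
    implementation would do.
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Radius 0: identifier from element symbol and number of heavy-atom neighbors.
    ids = [stable_hash((atoms[i], len(neighbors[i]))) for i in range(len(atoms))]
    counts = Counter(ids)

    # Each iteration widens every atom's environment by one bond.
    for _ in range(radius):
        ids = [
            stable_hash((ids[i], tuple(sorted(ids[j] for j in neighbors[i]))))
            for i in range(len(atoms))
        ]
        counts.update(ids)

    # Fold identifiers into a fixed-length vector, keeping counts instead of bits.
    vector = [0] * n_bits
    for identifier, count in counts.items():
        vector[identifier % n_bits] += count
    return vector


# Propane (C-C-C): the two terminal carbons share identical environments,
# so at least one position in the vector holds a count of 2 or more.
propane = counted_morgan_sketch(["C", "C", "C"], [(0, 1), (1, 2)])
```

A vector of this kind, computed for each structure, is the sort of fixed-length numeric input a neural network classifier can consume; keeping counts preserves how often a substructure occurs, which a plain bit vector would discard.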

Authors

  • Hyun Woo Kim
    Chemical Data-Driven Research Center, Korea Research Institute of Chemical Technology (KRICT), Daejeon 34114, Korea.
  • Mingxun Wang
    Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States.
  • Christopher A Leber
    Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, La Jolla, California 92093, United States.
  • Louis-Félix Nothias
    Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States.
  • Raphael Reher
    Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States.
  • Kyo Bin Kang
    Research Institute of Pharmaceutical Sciences, College of Pharmacy, Sookmyung Women's University, Seoul 04310, Korea.
  • Justin J J van der Hooft
    Bioinformatics Group, Wageningen University, Wageningen, The Netherlands. justin.vanderhooft@wur.nl.
  • Pieter C Dorrestein
    Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093; Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, CA 92037. pdorrestein@ucsd.edu.
  • William H Gerwick
    Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, La Jolla, California, 92037, United States of America. wgerwick@ucsd.edu.
  • Garrison W Cottrell
    Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA. gary@ucsd.edu. http://cseweb.ucsd.edu/~gary/