Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

The number of unannotated gene sequences in databases is increasing due to sequencing advances. Therefore, computational methods to predict the functions of unannotated genes are needed. Moreover, novel enzyme discovery for metabolic engineering applications further encourages annotation of sequences. Here, enzyme functions are predicted using two general approaches, each including several machine learning algorithms. First, Enzyme-models (E-models) predict Enzyme Commission (EC) numbers from amino acid sequence information. Second, Substrate-Enzyme models (SE-models) are built to predict substrates of enzymatic reactions together with EC numbers, and Substrate-Enzyme-Product models (SEP-models) are built to predict substrates, products, and EC numbers. While the accuracy of E-models is not optimal, SE-models and SEP-models predict EC numbers and reactions with high accuracy using all tested machine learning-based methods. For example, a single Random Forests-based SEP-model predicts EC first digits with an average AUC score of over 0.94. Various metrics indicate that the current strategy of combining sequence and chemical structure information is effective at improving enzyme reaction prediction.
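As a rough illustration of the idea of combining sequence and chemical structure information, the sketch below builds one feature vector from an amino acid sequence plus substrate and product descriptors, and computes a class-averaged AUC. It is not the authors' pipeline: the amino acid composition encoding, the binary structure bit vectors, the function names, and the reading of "average AUC" as the mean of one-vs-rest AUCs over EC first digits are all assumptions made for this example.

```python
from typing import Dict, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq: str) -> List[float]:
    """Fraction of each standard amino acid in the sequence (assumed encoding)."""
    seq = seq.upper()
    n = sum(seq.count(a) for a in AMINO_ACIDS)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def sep_features(seq: str,
                 substrate_bits: Sequence[int],
                 product_bits: Sequence[int]) -> List[float]:
    """Concatenate sequence features with hypothetical substrate/product bit vectors."""
    return (composition(seq)
            + [float(b) for b in substrate_bits]
            + [float(b) for b in product_bits])


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_auc(true_classes: Sequence[int],
                class_scores: Dict[int, Sequence[float]]) -> float:
    """Mean one-vs-rest AUC over classes (assumed meaning of 'average AUC')."""
    aucs = []
    for cls, scores in class_scores.items():
        labels = [1 if t == cls else 0 for t in true_classes]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)
```

In this sketch, `sep_features` would produce the input rows for any classifier (for example a random forest), and `average_auc` would be applied to that classifier's per-class scores for the EC first digit. An E-model in the same style would use `composition(seq)` alone.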

Authors

  • Naoki Watanabe
    Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Osaka, 566-0002, Japan, 81 8093069457.
  • Masahiro Murata
    Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan.
  • Teppei Ogawa
    Mitsui Knowledge Industry Co., Ltd. (MKI), 2-3-33 Nakanoshima, Kita-ku, Osaka 530-0005, Japan.
  • Christopher J Vavricka
    Graduate School of Science, Technology and Innovation, Kobe University, Kobe, Japan.
  • Akihiko Kondo
    Graduate School of Science, Technology and Innovation, Kobe University, Kobe, Japan; Engineering Biology Research Center, Kobe University, Kobe, Japan; Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, Kobe, Japan. Electronic address: akondo@kobe-u.ac.jp.
  • Chiaki Ogino
    Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan.
  • Michihiro Araki
    Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Osaka, 566-0002, Japan, 81 8093069457.