GraphKM: machine and deep learning for Km prediction of wildtype and mutant enzymes.

Journal: BMC bioinformatics
Published Date:

Abstract

The Michaelis constant (Km) is one of the essential parameters of enzyme kinetics in protein engineering, enzyme engineering, and synthetic biology. Because large-scale experimental measurement of Km is difficult and time-consuming, predicting Km values with machine and deep learning models would accelerate enzyme kinetics studies. Existing machine and deep learning models are limited to specific enzymes, i.e., a minority of enzymes or wildtype enzymes only. Here, we used the deep learning framework PaddlePaddle to implement a machine and deep learning approach (GraphKM) for Km prediction of wildtype and mutant enzymes. GraphKM is composed of graph neural networks (GNN), fully connected layers, and a gradient boosting framework. To construct the model inputs, we represented the substrates as molecular graphs and the enzymes through a pretrained transformer-based language model. We compared the results obtained with different GNNs (GIN, GAT, GCN, and GAT-GCN). The GAT-GCN-based model generally outperformed the others. To evaluate the prediction performance of GraphKM and other reported Km prediction models, we collected an independent Km dataset (HXKm) from the literature.
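
As a rough illustration of how a substrate can enter a GNN as a molecular graph, the sketch below applies a single GCN propagation step to a toy three-atom graph and mean-pools the result into one substrate vector. It is not the GraphKM implementation: the function names, the two-element atom features, and the identity weight matrix are assumptions chosen for readability, and the code is plain Python rather than PaddlePaddle.

```python
import math

def gcn_layer(adjacency, features, weights):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python lists."""
    n = len(adjacency)
    # Add self-loops so every atom also keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Node degrees of the self-looped graph.
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate neighbour features (norm @ features).
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    # Linear transform followed by ReLU.
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]

def mean_pool(node_embeddings):
    """Average node embeddings into a single graph-level vector."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]

# Toy substrate (assumption): heavy atoms of ethanol, C-C-O.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
# Hypothetical atom features: one-hot [is_carbon, is_oxygen].
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Identity weights keep the arithmetic easy to follow; a real model learns these.
weights = [[1.0, 0.0], [0.0, 1.0]]

node_embeddings = gcn_layer(adjacency, features, weights)
graph_embedding = mean_pool(node_embeddings)
print(graph_embedding)
```

In a pipeline of the kind the abstract describes, a pooled substrate vector like `graph_embedding` would be combined with the enzyme embedding from the language model before the fully connected layers and the gradient boosting step; those stages are not shown here.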

Authors

  • Xiao He
    Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland. xiao.he@bsse.ethz.ch.
  • Ming Yan