CatPred: a comprehensive framework for deep learning in vitro enzyme kinetic parameters.

Journal: Nature Communications
PMID:

Abstract

Estimation of enzymatic activities still heavily relies on experimental assays, which can be cost- and time-intensive. We present CatPred, a deep learning framework for predicting in vitro enzyme kinetic parameters, including turnover numbers (kcat), Michaelis constants (Km), and inhibition constants (Ki). CatPred addresses key challenges such as the lack of standardized datasets, performance evaluation on enzyme sequences that are dissimilar to those used during training, and model uncertainty quantification. We explore diverse learning architectures and feature representations, including pretrained protein language models and three-dimensional structural features, to enable robust predictions. CatPred provides accurate predictions with query-specific uncertainty estimates, with lower predicted variances correlating with higher accuracy. Pretrained protein language model features particularly enhance performance on out-of-distribution samples. CatPred also introduces benchmark datasets with extensive coverage (~23,000, 41,000, and 12,000 data points for kcat, Km, and Ki, respectively). Our framework performs competitively with existing methods while offering reliable uncertainty quantification.
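The abstract does not specify how CatPred's query-specific variances are produced; as a minimal sketch (an assumption, not the paper's method), a regression head can be trained with a heteroscedastic Gaussian negative log-likelihood so that each prediction carries its own variance, and low-variance predictions can then be treated as higher-confidence:

```python
import math

def gaussian_nll(y_true, mu, var, eps=1e-6):
    """Heteroscedastic Gaussian negative log-likelihood for one sample.

    Training a model with this loss lets it emit a per-query variance
    alongside its point prediction: confident-but-wrong outputs are
    penalized heavily, while honest high-variance outputs are not.
    """
    var = max(var, eps)  # guard against non-positive variance
    return 0.5 * (math.log(2 * math.pi * var) + (y_true - mu) ** 2 / var)

def filter_by_confidence(preds, max_var):
    """Keep only (mean, variance) predictions below a variance threshold."""
    return [(mu, var) for mu, var in preds if var <= max_var]

# A correct prediction reported with small variance scores better than the
# same prediction reported with large (unwarranted) variance.
assert gaussian_nll(2.0, 2.0, 0.1) < gaussian_nll(2.0, 2.0, 10.0)

# A wrong prediction is penalized more when the model claims low variance.
assert gaussian_nll(2.0, 4.0, 0.1) > gaussian_nll(2.0, 4.0, 1.0)
```

Filtering predictions by the predicted variance (e.g. `filter_by_confidence(preds, 0.1)`) is one simple way to realize the abstract's observation that lower predicted variances correlate with higher accuracy.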

Authors

  • Veda Sheersh Boorla
Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA.
  • Costas D Maranas
    Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA; DOE Center for Advanced Bioenergy and Bioproducts Innovation, The Pennsylvania State University, University Park, PA, USA. Electronic address: costas@psu.edu.