Accelerated enzyme engineering by machine-learning guided cell-free expression.

Journal: Nature communications
Published Date:

Abstract

Enzyme engineering is limited by the challenge of rapidly generating and using large datasets of sequence-function relationships for predictive design. To address this challenge, we develop a machine learning (ML)-guided platform that integrates cell-free DNA assembly, cell-free gene expression, and functional assays to rapidly map fitness landscapes across protein sequence space and optimize enzymes for multiple, distinct chemical reactions. We apply this platform to engineer amide synthetases by evaluating substrate preference for 1,217 enzyme variants in 10,953 unique reactions. We use these data to build augmented ridge regression ML models for predicting amide synthetase variants capable of making nine small-molecule pharmaceuticals. Over these nine compounds, ML-predicted enzyme variants demonstrate 1.6- to 42-fold improved activity relative to the parent. Our ML-guided, cell-free framework promises to accelerate enzyme engineering by enabling iterative exploration of protein sequence space to build specialized biocatalysts in parallel.
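
The modeling step described in the abstract can be illustrated with a minimal sketch: plain ridge regression on one-hot encoded sequences, used to rank untested variants by predicted activity relative to the parent. This is not the authors' implementation. The "augmented" component of their models is not described in the abstract and is not reproduced here, and the sequences, activity values, regularization strength, and function names (`one_hot`, `fit_ridge`, `predict`) are all hypothetical and chosen only for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Encode a protein sequence as a flat (len(seq) * 20) one-hot vector."""
    vec = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq):
        vec[pos, AMINO_ACIDS.index(aa)] = 1.0
    return vec.ravel()


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept is left unpenalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weight vector w.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def predict(seqs, w, b):
    """Predict activity for each sequence from the fitted weights."""
    return np.array([one_hot(s) @ w + b for s in seqs])


# Hypothetical toy data: a 4-residue parent and three single mutants,
# each with a made-up activity measurement (not values from the paper).
parent = "ACDE"
train = {"ACDE": 1.0, "GCDE": 2.0, "ACDF": 1.5, "AKDE": 0.5}

X = np.array([one_hot(s) for s in train])
y = np.array(list(train.values()))
w, b = fit_ridge(X, y, alpha=0.1)  # alpha is an arbitrary choice here

# Rank untested double mutants by predicted activity and express each
# prediction as a fold change over the parent's predicted activity.
candidates = ["GCDF", "GKDE", "AKDF"]
scores = predict(candidates, w, b)
fold = scores / predict([parent], w, b)[0]
best = candidates[int(np.argmax(scores))]

for seq, f in zip(candidates, fold):
    print(f"{seq}: predicted {f:.2f}-fold vs parent")
print("top-ranked candidate:", best)
```

In this toy setting the model learns additive per-residue effects from single mutants and extrapolates to combinations. In practice, the regularization strength would be chosen by cross-validation on the measured data rather than fixed in advance.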

Authors

  • Grant M Landwehr
    Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
  • Jonathan W Bogart
    Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
  • Carol Magalhaes
    Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
  • Eric G Hammarlund
    Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
  • Ashty S Karim
    Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA. ashty.karim@northwestern.edu.
  • Michael C Jewett
    Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA. mjewett@stanford.edu.