Computational scoring and experimental evaluation of enzymes generated by neural networks.

Journal: Nature Biotechnology

Published Date: Apr 23, 2024

Abstract

In recent years, generative protein sequence models have been developed to sample novel sequences. However, predicting whether generated proteins will fold and function remains challenging. We evaluate a set of 20 diverse computational metrics to assess the quality of enzyme sequences produced by three contrasting generative models: ancestral sequence reconstruction, a generative adversarial network and a protein language model. Focusing on two enzyme families, we expressed and purified over 500 natural and generated sequences with 70-90% identity to the most similar natural sequences to benchmark computational metrics for predicting in vitro enzyme activity. Over three rounds of experiments, we developed a computational filter that improved the rate of experimental success by 50-150%. The proposed metrics and models will drive protein engineering research by serving as a benchmark for generative protein sequence models and helping to select active variants for experimental testing.
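The 70-90% identity window mentioned above can be illustrated with a small sketch. This is not the authors' pipeline: it assumes all sequences have already been placed in one shared alignment (equal length, `-` for gaps), and the function names, the toy sequences and the choice to skip columns where both sequences have a gap are hypothetical choices made only for illustration.

```python
from typing import Iterable


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumption: columns where both sequences carry a gap are skipped;
    a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def max_identity_to_naturals(candidate: str, naturals: Iterable[str]) -> float:
    """Identity of a generated sequence to its most similar natural sequence."""
    return max(percent_identity(candidate, n) for n in naturals)


def in_identity_window(candidate: str, naturals: Iterable[str],
                       low: float = 70.0, high: float = 90.0) -> bool:
    """True if the closest natural sequence is within [low, high] % identity."""
    return low <= max_identity_to_naturals(candidate, naturals) <= high


# Toy, made-up aligned sequences (not taken from the study).
naturals = ["ACDEFGHIKL", "ACDEFGHIKV"]
generated = ["ACDEFGHAAA", "ACDEFGHIKL"]
selected = [s for s in generated if in_identity_window(s, naturals)]
```

In practice the identity values would come from a proper alignment tool, and a window like this would be only one criterion alongside the other metrics the abstract refers to.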

Authors

Sean R Johnson
New England Biolabs, Ipswich, MA, USA.

Xiaozhi Fu
Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden.

Sandra Viknander
Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden.

Clara Goldin
Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden.

Sarah Monaco

Aleksej Zelezniak
Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden. aleksej.zelezniak@chalmers.se.

Kevin K Yang
Division of Chemistry and Chemical Engineering; California Institute of Technology; Pasadena, California; United States of America.

Keywords

Algorithms; Amino Acid Sequence; Computational Biology; Enzymes; Neural Networks, Computer; Protein Engineering

External Resources

PubMed ID: 38653796
