Large-scale design and refinement of stable proteins using sequence-only models.

Journal: PLOS ONE
Published Date:

Abstract

Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model, despite weaknesses including a noisy data set, can be used to substantially increase the stability of both expert-designed and model-generated proteins.

Authors

  • Jedediah M Singer
    Two Six Technologies, Arlington, Virginia, United States of America.
  • Scott Novotney
    Two Six Technologies, Arlington, Virginia, United States of America.
  • Devin Strickland
    Department of Electrical and Computer Engineering, University of Washington, Seattle, Washington, United States of America.
  • Hugh K Haddox
    Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, United States of America.
  • Nicholas Leiby
    Two Six Technologies, Arlington, Virginia, United States of America.
  • Gabriel J Rocklin
    Department of Pharmacology and Center for Synthetic Biology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America.
  • Cameron M Chow
    Department of Biochemistry, University of Washington, Seattle, WA, USA.
  • Anindya Roy
    Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, United States of America.
  • Asim K Bera
    Department of Biochemistry, University of Washington, Seattle, WA, USA.
  • Francis C Motta
    Department of Mathematical Sciences, Florida Atlantic University, Boca Raton, Florida, United States of America.
  • Longxing Cao
    Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, United States of America.
  • Eva-Maria Strauch
    Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, Georgia, United States of America.
  • Tamuka M Chidyausiku
    Department of Biochemistry, University of Washington, Seattle, WA, USA.
  • Alex Ford
    Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, United States of America.
  • Ethan Ho
    Texas Advanced Computing Center, Austin, Texas, United States of America.
  • Alexander Zaitzeff
    Two Six Technologies, Arlington, Virginia, United States of America.
  • Craig O Mackenzie
    Quantitative Biomedical Sciences Graduate Program, Dartmouth College, Hanover, New Hampshire, United States of America.
  • Hamed Eramian
    Netrias LLC, 3100 Clarendon Boulevard, Suite 200, Arlington, Virginia 22201, United States.
  • Frank DiMaio
    Department of Biochemistry, University of Washington, Seattle, WA 98195, USA.
  • Gevorg Grigoryan
    Departments of Computer Science and Biological Sciences, Dartmouth College, Hanover, New Hampshire, United States of America.
  • Matthew Vaughn
    Texas Advanced Computing Center, Austin, Texas, United States of America.
  • Lance J Stewart
    Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, United States of America.
  • David Baker
    Department of Biochemistry, University of Washington, Seattle, Washington.
  • Eric Klavins
    Department of Electrical and Computer Engineering, University of Washington, Seattle, Washington, United States of America.