Deep learning for optimization of protein expression.

Journal: Current Opinion in Biotechnology

PMID: 37087839

Abstract

Advances in high-throughput DNA synthesis and sequencing have fuelled the use of massively parallel reporter assays for strain characterization. These experiments produce large datasets that map DNA sequences to protein expression levels, and have sparked increased interest in data-driven methods for sequence-to-expression modeling. Here, we highlight progress in deep learning models of protein expression and their potential for optimizing strains engineered to produce recombinant proteins. We discuss recent works that built highly accurate models as well as the challenges that hinder wider adoption by end users. There is a need to better align this technology with the requirements and capabilities encountered in strain engineering, particularly the cost of data acquisition and the need for interpretable models that generalize beyond the training data. Overcoming these barriers will help to incentivize academic and industrial laboratories to tap into a new era of data-centric strain engineering.

Authors

Evangelos-Marios Nikolados

School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JH, UK.

Diego A Oyarzún

School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JH, UK. d.oyarzun@ed.ac.uk.

Keywords

Bioengineering; Deep Learning; Proteins; Recombinant Proteins

