Transfer learning for cross-context prediction of protein expression from 5'UTR sequence.

Journal: Nucleic acids research

Published Date: Jul 22, 2024

Abstract

Model-guided DNA sequence design can accelerate the reprogramming of living cells. It allows us to engineer more complex biological systems by removing the need to physically assemble and test each potential design. While mechanistic models of gene expression have seen some success in supporting this goal, data-centric, deep learning-based approaches often provide more accurate predictions. This accuracy, however, comes at a cost: a lack of generalization across genetic and experimental contexts that has limited their wider use outside the context in which they were trained. Here, we address this issue by demonstrating how a simple transfer learning procedure can effectively tune a pre-trained deep learning model to predict protein translation rate from 5' untranslated region (5'UTR) sequence for diverse contexts in Escherichia coli using a small number of new measurements. This allows for important model features learnt from expensive massively parallel reporter assays to be easily transferred to new settings. By releasing our trained deep learning model and complementary calibration procedure, this study acts as a starting point for continually refined model-based sequence design that builds on previous knowledge and future experimental efforts.
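
The abstract does not specify the architecture or the calibration procedure, so the following is only a minimal sketch of the general idea of transfer learning with a small number of new measurements, not the authors' released model or method. It assumes a frozen feature extractor (here a random projection standing in for pre-trained layers) and refits only a final linear layer by ridge regression on synthetic data from a "new context". All names (`frozen_features`, `fit_head`, `predict`), the sequence length, and the data are hypothetical.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Flatten a fixed-length DNA sequence into a one-hot vector."""
    idx = [BASES.index(b) for b in seq]
    return np.eye(4)[idx].ravel()

def frozen_features(seqs, W):
    """Hypothetical stand-in for the frozen layers of a pre-trained model."""
    X = np.stack([one_hot(s) for s in seqs])
    return np.tanh(X @ W)

def fit_head(H, y, ridge=1e-2):
    """Refit only the final linear layer (ridge regression, unpenalized bias)."""
    A = np.hstack([H, np.ones((H.shape[0], 1))])
    reg = ridge * np.eye(A.shape[1])
    reg[-1, -1] = 0.0  # do not shrink the bias term
    return np.linalg.solve(A.T @ A + reg, A.T @ y)

def predict(H, head):
    """Apply the refitted linear layer to frozen features."""
    return np.hstack([H, np.ones((H.shape[0], 1))]) @ head

rng = np.random.default_rng(0)
seq_len, hidden = 12, 8

# Assumption: these weights play the role of features learnt during pre-training.
W = rng.normal(size=(4 * seq_len, hidden))

def random_seqs(n):
    """Draw n random sequences of length seq_len (synthetic data only)."""
    return ["".join(rng.choice(list(BASES), size=seq_len)) for _ in range(n)]

# Synthetic "new context": a rescaled, shifted readout of the same features,
# plus measurement noise. Real data would come from new experiments.
true_head = rng.normal(size=hidden)
new_seqs = random_seqs(40)
H_new = frozen_features(new_seqs, W)
y_new = 2.5 * (H_new @ true_head) + 1.0 + rng.normal(scale=0.05, size=40)

# Calibrate on 30 measurements, evaluate on the 10 held out.
head = fit_head(H_new[:30], y_new[:30])
pred = predict(H_new[30:], head)
r = np.corrcoef(pred, y_new[30:])[0, 1]
print(f"held-out Pearson r: {r:.3f}")
```

Refitting only the last layer keeps the number of free parameters small relative to the number of new measurements, which is one common reason to freeze earlier layers when data in the new setting are scarce.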

Authors

Pierre-Aurélien Gilliot

School of Biological Sciences, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, UK.

Thomas E Gorochowski

School of Biological Sciences, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, UK; BrisSynBio, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, UK. Electronic address: thomas.gorochowski@bristol.ac.uk.

Keywords

5' Untranslated Regions; Deep Learning; Escherichia coli; Protein Biosynthesis

External Resources

PubMed ID (PMID): 38864396
