It has been known for decades that codon usage contributes to translation efficiency and hence to protein production levels. However, its role in protein synthesis is still only partly understood. This lack of understanding hampers the design of synt...
The functional impact of single nucleotide polymorphisms (SNPs) on translation has yet to be considered when prioritizing disease-causing SNPs from genome-wide association studies (GWAS). Here we apply machine learning models to genome-wide ribosome ...
Bioactive peptides are key molecules in health and medicine. Deep learning holds a big promise for the discovery and design of bioactive peptides. Yet, suitable experimental approaches are required to validate candidates in high throughput and at low...
Multiplexed assays of variant effect are powerful methods to profile the consequences of rare variants on gene expression and organismal fitness. Yet, few studies have integrated several multiplexed assays to map variant effects on gene expression in...
INTRODUCTION: Gene identification for genetic diseases is critical for the development of new diagnostic approaches and personalized treatment options. Prioritization of gene translation is an important consideration in the molecular biology field, a...
mRNA therapeutics are revolutionizing the pharmaceutical industry, but methods to optimize the primary sequence for increased expression are still lacking. Here, we design 5'UTRs for efficient mRNA translation using deep learning. We perform polysome...
Model-guided DNA sequence design can accelerate the reprogramming of living cells. It allows us to engineer more complex biological systems by removing the need to physically assemble and test each potential design. While mechanistic models of gene e...
BACKGROUND: The design of nucleotide sequences with defined properties is a long-standing problem in bioengineering. An important application is protein expression, be it in the context of research or the production of mRNA vaccines. The rate of prot...
mRNA degradation is a central process that affects all gene expression levels, though it remains challenging to predict the stability of a mRNA from its sequence, due to the many coupled interactions that control degradation rate. Here, we carried ou...
Causal inference-assisted machine learning was used to predict photosynthetic bacterial (PSB) protein production capacity and identify key factors. The extreme gradient boosting algorithm effectively predicted protein content, while the gradient boos...