A deep learning model trained on expressed transcripts across different tissue types reveals cell-type codon-optimization preferences.

Journal: Nucleic Acids Research

PMID: 40156867

Abstract

Species-specific differences in protein translation can affect the design of protein-based drugs. Consequently, efficient expression of recombinant proteins often requires codon optimization. Publicly available optimization tools do not always result in higher expression levels and can lead to protein misfolding and reduced expression. Here, we aimed to develop a novel deep learning (DL) tool using a recurrent neural network (RNN) to define cell type-dependent codon biases. Using gene expression data from three different tissue types (brain, liver, and muscle) and all secretory genes, we trained DL models to predict optimal codon usage. Codon-optimized sequences for test reporter genes exhibited enhanced protein expression compared to their original sequences and those optimized using a publicly available tool. Interestingly, DL models trained on genes expressed in liver cells (hepatocytes) resulted in the highest levels of expression when tested in vitro, irrespective of the cell type. Our findings also demonstrate that DL-based codon optimization algorithms can significantly enhance protein translation, particularly for secretory proteins, which are crucial for therapeutic applications. This research represents a novel approach to codon optimization with broader implications for protein-based pharmaceuticals, vaccine manufacturing, gene therapy, and other recombinant DNA products.
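To make the notion of a codon bias concrete, the following is a minimal sketch of a frequency-based baseline: it tallies how often each synonymous codon is used within a set of coding sequences and picks the most frequent codon per amino acid. This is not the RNN-based method described in the abstract, only an illustration of the quantity such models learn from. The codon table is a deliberately partial subset, and the function names and example sequences are hypothetical.

```python
from collections import Counter, defaultdict

# Partial codon table for illustration only (assumption: a real tool
# would use the full 64-codon standard genetic code).
CODON_TO_AA = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
}

def codon_usage(sequences):
    """Return the relative usage of each codon among its synonymous codons."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TO_AA:
                counts[codon] += 1
    # Total observations per amino acid, used to normalize codon counts.
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TO_AA[codon]] += n
    return {codon: n / totals[CODON_TO_AA[codon]] for codon, n in counts.items()}

def most_frequent_codon(usage):
    """Map each amino acid to its most frequently used codon."""
    best = {}
    for codon, freq in usage.items():
        aa = CODON_TO_AA[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return best

# Hypothetical coding sequences standing in for transcripts from one tissue.
example_transcripts = ["GCCAAGGAGGCC", "GCCAAGGAAGCT"]
usage = codon_usage(example_transcripts)
preferred = most_frequent_codon(usage)
```

Running the same tally on transcript sets from different tissues would give one usage table per tissue, which is the simplest way to compare cell type-dependent preferences before turning to a learned model.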

Authors

Sandhiya Ravi
Department of Genetic and Cellular Medicine, UMass Chan Medical School, Worcester, MA 01605, United States.

Tapan Sharma
Department of Genetic and Cellular Medicine, UMass Chan Medical School, Worcester, MA 01605, United States.

Mitchell Yip
Department of Genetic and Cellular Medicine, UMass Chan Medical School, Worcester, MA 01605, United States.

Huiya Yang
Department of Genetic and Cellular Medicine, UMass Chan Medical School, Worcester, MA 01605, United States.

Jun Xie
Information Technology Center, West China Hospital of Sichuan University, Chengdu, China.

Guangping Gao
Department of Genetic and Cellular Medicine, UMass Chan Medical School, Worcester, MA 01605, United States.

Phillip W L Tai
Department of Genetic and Cellular Medicine, UMass Chan Medical School, Worcester, MA 01605, United States.

Keywords

Animals; Brain; Codon; Codon Usage; Deep Learning; Hepatocytes; Humans; Liver; Neural Networks, Computer; Organ Specificity; Protein Biosynthesis; Recombinant Proteins

