UTR-Insight: integrating deep learning for efficient 5' UTR discovery and design.
Journal:
BMC Genomics
Published Date:
Feb 4, 2025
Abstract
The 5' UTR is critical for mRNA stability and translation efficiency in mRNA therapeutics. We developed UTR-Insight, a model that integrates a pretrained language model with a CNN-Transformer architecture. It explains 89.1% of the variation in mean ribosome load (MRL) for random 5' UTRs and 82.8% for endogenous 5' UTRs, surpassing existing models. Using UTR-Insight, we performed high-throughput in silico screening of hundreds of thousands of endogenous 5' UTRs from primates, mice, and viruses. The screened sequences increased protein expression by up to 319% compared to the human α-globin 5' UTR, and UTR-Insight-designed sequences achieved even greater expression levels than high-performing endogenous 5' UTRs.
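
To make the two headline numbers concrete, the following is a minimal sketch of how a "fraction of MRL variation explained" and a "percent increase in expression" could be computed. It rests on assumptions not stated in the abstract: that the explained variation is the coefficient of determination (R²) between measured and predicted MRL (some studies report the squared Pearson correlation instead), and that "increased by up to 319%" is a relative increase over the reference 5' UTR. The function names `r_squared` and `percent_increase` and all numeric values are hypothetical and purely illustrative; they are not taken from UTR-Insight.

```python
from statistics import mean


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: this is the metric behind "explains X% of MRL variation".
    """
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    m = mean(measured)
    ss_tot = sum((y - m) ** 2 for y in measured)                     # total variation
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))  # unexplained part
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot


def percent_increase(candidate, reference):
    """Relative increase of candidate over reference, in percent."""
    if reference == 0:
        raise ValueError("reference expression must be non-zero")
    return 100.0 * (candidate - reference) / reference


if __name__ == "__main__":
    # Toy MRL values (hypothetical, not data from the paper).
    measured_mrl = [4.0, 5.5, 6.1, 7.2, 3.3]
    predicted_mrl = [4.2, 5.1, 6.4, 7.0, 3.6]
    print(f"R^2 = {r_squared(measured_mrl, predicted_mrl):.3f}")

    # Under the relative-increase reading, +319% means 4.19 times the reference.
    print(f"increase = {percent_increase(4.19, 1.0):.0f}%")
```

Under this reading, an R² of 0.891 means that 89.1% of the variance in measured MRL is accounted for by the predictions, and a 319% increase corresponds to expression at 4.19 times the level of the reference 5' UTR.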