Unlocking Your Programmable and Creative RNA Sequence Designer with RDiffusion

Journal: bioRxiv
Published Date:

Abstract

As a cornerstone of the central dogma, RNA has both witnessed and actively shaped three billion years of evolution. Over this vast timescale, a remarkable diversity of RNA molecules has emerged, executing functions that extend far beyond traditional roles in information transfer. In the post-genomic era, tens of millions of non-coding RNA sequences have been cataloged and millions functionally annotated, yet this knowledge merely scratches the surface of the vast RNA sequence space. Here, we introduce RDiffusion, a generative model designed to explore this RNA universe. RDiffusion is a diffusion-based framework that generates novel RNA sequences conditioned on diverse biological features, such as desired function, family type, secondary structure, tertiary structure, or binding proteins, so that the outputs meet user-defined specifications. We evaluate RDiffusion across a broad spectrum of RNA design tasks and find that it not only surpasses all baseline methods in design success rate and sequence diversity but also achieves state-of-the-art performance on downstream tasks, functioning as a powerful RNA foundation model. To translate RDiffusion into disease applications, we chose osteoarthritis (OA) as a case study, using RDiffusion for de novo design of miRNA sequences guided by a customized, data-driven seed selection and screening pipeline. These designed candidates are currently undergoing experimental validation, and the final evaluation data will be presented upon formal publication. By providing a unified approach to RNA design, we anticipate that RDiffusion will accelerate the programmable engineering of RNA, with profound implications for human health, drug development, and gene-editing tools, while also establishing a new standard for representation learning on RNA-related downstream tasks.

Authors

  • Wang, J.
  • Dong, J.
  • Li, T.
  • Yang, L.
  • Yin, J.
  • Chen, J.
  • Dong, Y.
  • Li, J.
  • Tan, C.

Categories