AI foundation models for RNA biology.

Journal: RNA Biology
Published Date:

Abstract

RNA biology is being transformed by AI foundation models. These models learn the intricate relationships between RNA sequence, structure, and function by training on large, diverse datasets spanning millions of RNA molecules across many species. Through self-supervised learning on these sequences, they acquire a generalizable representation of RNA that can be fine-tuned for a range of downstream tasks, enabling the decoding of functional rules embedded in RNA sequences. In this review, we provide a comprehensive guide to RNA foundation models. Using concrete examples from RNA biology, we first introduce the concept of foundation models and then review the importance of pre-training datasets, architectural innovations, self-supervised strategies, and the fine-tuning approaches that translate general RNA representations into task-specific models. Crucially, we highlight how explainable AI (XAI) methods turn these models from black-box predictors into discovery tools that reveal candidate cis-regulatory elements and structural motifs. As RNA foundation models continue to advance and integrate more multimodal biological data, they are expected to uncover additional regulatory rules and functions encoded in RNA.
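As a concrete illustration of the self-supervised step mentioned above, the following minimal Python sketch prepares one training example for a masked-nucleotide objective, one common self-supervised strategy. The abstract does not specify which objective any particular model uses, so the function name `mask_rna`, the `[MASK]` token, the 15% masking rate, and the single-nucleotide tokenization are all illustrative assumptions rather than the method of any published RNA foundation model.

```python
import random

MASK = "[MASK]"  # hypothetical mask token; real models define their own vocabulary


def mask_rna(sequence, mask_rate=0.15, seed=0):
    """Build one masked-nucleotide training example (illustrative only).

    Returns (masked_tokens, labels). labels holds the original nucleotide
    at masked positions and None elsewhere, so a loss would be computed
    only where the model has to reconstruct a hidden token.
    """
    rng = random.Random(seed)  # local RNG keeps the example reproducible
    # Assumption: one token per nucleotide, with DNA-style T mapped to U.
    tokens = list(sequence.upper().replace("T", "U"))
    # Mask at least one position, but never more than the sequence length.
    n_mask = min(len(tokens), max(1, round(mask_rate * len(tokens))))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return masked, labels


if __name__ == "__main__":
    masked, labels = mask_rna("AUGGCUACGUAGCUAGCUAA")
    print(" ".join(masked))
    print(labels)
```

During pre-training, a model would receive `masked` as input and be scored on how well it recovers the non-`None` entries of `labels`. Fine-tuning would then reuse the learned representation with a task-specific output layer.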

Authors
