FlashRNA: An Efficient Model for Regulatory Genomics

Journal: bioRxiv
Published Date:

Abstract

Transformer-based genomic sequence-to-function models effectively capture long-range genomic interactions but incur high computational costs due to the quadratic complexity of their self-attention layers. In this work, we introduce FlashRNA, which significantly improves computational and memory efficiency through FlashAttention, architectural improvements, and an optimized training setup. FlashRNA matches or slightly exceeds the predictive performance of similarly sized Borzoi or Flashzoi models, notably without depending on pre-trained weights, a major limitation of Flashzoi. Remarkably, we trained FlashRNA from scratch in one day on a single GPU, with significantly faster training and inference. By reducing computational cost, these improvements can facilitate further development of models for regulatory genomics. We demonstrate this in two downstream applications: 1) we train a large ensemble of 16 FlashRNA models and distill them into a single model to improve performance while maintaining efficiency, and 2) we fine-tune FlashRNA on three prediction tasks (ChIP-seq, RNA half-life, and translation efficiency), achieving performance matching or exceeding state-of-the-art task-specific models. Code: https://github.com/deepgenomics/flashrna
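
To illustrate the quadratic cost of self-attention noted in the abstract, the following minimal sketch counts the entries of the full attention score matrix that a standard implementation materializes per layer. The function names, sequence lengths, head count, and 2-byte entry size are hypothetical choices for illustration and are not taken from FlashRNA. FlashAttention computes the same attention output in tiles without storing this full matrix, which is why its memory use grows linearly rather than quadratically with sequence length.

```python
def attention_score_entries(seq_len: int, num_heads: int = 1) -> int:
    """Entries in the full (seq_len x seq_len) score matrix, summed over heads."""
    return num_heads * seq_len * seq_len


def score_matrix_bytes(seq_len: int, num_heads: int = 1, bytes_per_entry: int = 2) -> int:
    """Memory needed to store the full score matrix for one layer.

    bytes_per_entry=2 assumes 16-bit floats (an assumption for illustration).
    """
    return attention_score_entries(seq_len, num_heads) * bytes_per_entry


if __name__ == "__main__":
    # Hypothetical token counts; doubling the length quadruples the memory.
    for seq_len in (1024, 2048, 4096):
        mib = score_matrix_bytes(seq_len, num_heads=8) / 2**20
        print(f"seq_len={seq_len:5d}  score matrix = {mib:.0f} MiB per layer")
```

Doubling the sequence length quadruples the size of the score matrix, so avoiding its materialization matters most for the long input windows typical of sequence-to-function models.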

Authors

  • Andrew J. Jung; Helen Zhu; Alice J. Gao; Roujia Li; Mykhaylo Slobodyanyuk; Vivian Chu; Declan Lim; Leo J. Lee; Albi Celaj; Brendan J. Frey