RNAcentral in 2026: Genes and literature integration

Journal: bioRxiv
Published Date:

Abstract

RNAcentral was founded in 2014 to serve as a comprehensive database of non-coding RNA sequences. It began by providing a single unified interface to more specialised resources and now contains 45 million sequences. It has since grown beyond that role and provides several services and analyses, including secondary structure prediction with R2DT, sequence search, and analysis with Rfam. Since its last publication in 2021, RNAcentral has developed two major features. The first is literature integration through the development of LitScan and LitSumm. LitScan automatically identifies and links relevant publications to RNA entries, while LitSumm uses natural language processing to generate functional summaries from the literature. Together, these tools address the critical challenge of connecting sequence data with functional knowledge scattered across thousands of publications. The second is the creation of gene-level entries, which represent a large structural change to RNAcentral. While RNAcentral previously organised data exclusively at the sequence level, it now groups related transcripts into gene-centric views. This allows researchers to explore all isoforms, splice variants, and related sequences for a gene in a unified interface, better reflecting biological organisation and facilitating comparative analyses. RNAcentral is freely available at https://rnacentral.org.

Authors

  • Andrew Green; Carlos Eduardo Ribas; Isaac Jandalala; Philippa Muston; Colman O’Cathail; Guy Cochrane; Christina Ernst; Lingyun Zhao; Pedro Madrigal; Helen Attrill; Steven Marygold; Doron Lancet; Niv Dobzinski; Patricia P. Chan; Todd M. Lowe; Elspeth A. Bruford; Ruth L. Seal; Henning Hermjakob; Kalpana Panneerselvam; Robert D. Finn; Tatiana A. Gurbich; Sam Griffiths-Jones; Bastian Fromm; Kevin J. Peterson; Dominik Sordyl; Janusz M. Bujnicki; Sameer Velankar; Sri Devan Appasamy; Sudakshina Ganguly; Peng Zhang; Shunmin He; Kim M. Rutherford; Valerie Wood; Ruth C. Lovering; Ernesto Picardi; Nancy Ontiveros; Lin Huang; Zhichao Miao; Anton S. Petrov; Holly McCann; Emanuele Cavalleri; Marco Mesiti; Elena Rivas; Marcell Szikszai; Marcin Magnus; Jan Gerken; Maria Chuvochina; Danny Bergeron; Michelle Scott; Kelly Williams; Robin R Gutell; Cheong Xin Chan; Mark Quinton-Tulloch; Stavros Diamantakis; Anton I. Petrov; Alex Bateman; Blake A. Sweeney