Accurate RNA 3D structure prediction using a language model-based deep learning approach.

Journal: Nature Methods
PMID:

Abstract

Accurate prediction of RNA three-dimensional (3D) structures remains an unsolved challenge. Determining RNA 3D structures is crucial for understanding their functions and informing RNA-targeting drug development and synthetic biology design. The structural flexibility of RNA, which leads to the scarcity of experimentally determined data, complicates computational prediction efforts. Here we present RhoFold+, an RNA language model-based deep learning method that accurately predicts 3D structures of single-chain RNAs from sequences. By integrating an RNA language model pretrained on ~23.7 million RNA sequences and leveraging techniques to address data scarcity, RhoFold+ offers a fully automated end-to-end pipeline for RNA 3D structure prediction. Retrospective evaluations on RNA-Puzzles and CASP15 natural RNA targets demonstrate the superiority of RhoFold+ over existing methods, including human expert groups. Its efficacy and generalizability are further validated through cross-family and cross-type assessments, as well as time-censored benchmarks. Additionally, RhoFold+ predicts RNA secondary structures and interhelical angles, providing empirically verifiable features that broaden its applicability to RNA structure and function studies.

Authors

  • Tao Shen
    Shanghai Chenpon Pharmaceutical Co., Ltd., Shanghai, China.
  • Zhihang Hu
  • Siqi Sun
    Toyota Technological Institute at Chicago, Chicago, IL 60615, USA.
  • Di Liu
    Laboratory of Nutrition and Functional Food, College of Food Science and Engineering, Jilin University, Changchun, China.
  • Felix Wong
    Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Jiuming Wang
    Department of Computer Science and Engineering, The Chinese University of Hong Kong, Sha Tin, Hong Kong SAR 999077, China.
  • Jiayang Chen
    Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
  • Yixuan Wang
    Department of Cardiovascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
  • Liang Hong
    Department of Computer Science and Engineering, The Chinese University of Hong Kong, Sha Tin, Hong Kong SAR 999077, China.
  • Jin Xiao
    Sichuan University, China.
  • Liangzhen Zheng
    Tencent AI Lab, Shenzhen, China.
  • Tejas Krishnamoorthi
    School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA.
  • Irwin King
    Shenzhen Key Laboratory of Rich Media Big Data Analytics and Application, Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong; Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong.
  • Sheng Wang
    Intensive Care Medical Center, Tongji Hospital, School of Medicine, Tongji University, Shanghai, 200065, People's Republic of China.
  • Peng Yin
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, Massachusetts 02115, United States.
  • James J Collins
    Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Yu Li
    Department of Public Health, Shihezi University School of Medicine, 832000, China.