Template-based RNA structure prediction advanced through a blind code competition

Journal: bioRxiv
Published Date:

Abstract

Automatically predicting RNA 3D structure from sequence remains an unsolved challenge in biology and biotechnology. Here, we describe a Kaggle code competition engaging over 1700 teams and 43 previously unreleased structures to tackle this challenge. The top three submitted algorithms achieved scores within statistical error of the winners of the recent CASP16 competition. Unexpectedly, the top Kaggle strategy involved a pipeline for discovering 3D templates, without the use of deep learning. We integrated this template-modeling pipeline and other Kaggle strategies to develop a single model, RNAPro, that retrospectively outperformed individual Kaggle models on the same test set. These results suggest a growing importance of template-based modeling in RNA structure prediction.

Authors

  • Youhan Lee; Shujun He; Toshiyuki Oda; G. John Rao; Yehyun Kim; Raehyun Kim; Hyunjin Kim; Cher Keng Heng; Danny Kowerko; Haowei Li; Hoa Nguyen; Arunodhayan Sampathkumar; Raúl Enrique Gómez; Meng Chen; Atsushi Yoshizawa; Shun Kuraishi; Kenji Ogawa; Shuxian Zou; Alejo Paullier; Bingkang Zhao; Huey-Long Chen; Tsu-An Hsu; Tatsuya Hirano; Wah Chiu; Jeanine G. Gezelle; Daniel Haack; Yibao Hong; Shekhar Jadhav; Deepak Koirala; Rachael C Kretsch; Anna Lewicka; Shanshan Li; Marco Marcia; Joseph Piccirilli; Boris Rudolfs; Yoshita Srivastava; Anna-Lena Steckelberg; Zhaoming Su; Navtej Toor; Liu Wang; Zi Yang; Kaiming Zhang; Jian Zou; David Baker; Shi-Jie Chen; Maggie Demkin; Andrew Favor; Alissa M Hummer; Chaitanya K. Joshi; Andriy Kryshtafovych; Emine Küçükbenli; Zhichao Miao; John Moult; Christian Munley; Walter Reade; Theo Viel; Eric Westhof; Sicheng Zhang; Rhiju Das