FlexRibbon: Joint Sequence and Structure Pretraining for Protein Modeling

Journal: bioRxiv
Published Date:

Abstract

Protein foundation models have advanced rapidly, with most approaches falling into two dominant paradigms. Sequence-only language models (e.g., ESM-2) capture sequence semantics at scale but lack structural grounding. MSA-based predictors (e.g., AlphaFold 2/3) achieve accurate folding by exploiting evolutionary couplings, but their reliance on homologous sequences makes them less reliable in highly mutated or alignment-sparse regimes. We present FlexRibbon, a pretrained protein model that jointly learns from amino acid sequences and three-dimensional structures. Our pretraining strategy combines masked language modeling with diffusion-based denoising, enabling bidirectional sequence-structure learning without requiring MSAs. Trained on both experimentally resolved structures and AlphaFold 2 predictions, FlexRibbon captures global folds as well as flexible conformations critical for biological function. Evaluated across diverse tasks spanning interface design, intermolecular interaction prediction, and protein function prediction, FlexRibbon establishes new state-of-the-art performance on 12 different tasks, with particularly strong gains in mutation-rich settings where MSA-based methods often struggle.
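
As a rough illustration of how a masked language modeling objective and a denoising objective can be combined into one training loss, the sketch below sums a cross-entropy term over masked residue positions and a noise-prediction term over perturbed 3D coordinates. It is not FlexRibbon's implementation: the abstract does not specify the loss form, noise schedule, or weighting, so the function names, the single Gaussian noise level `sigma`, the noise-prediction target, and the `weight` parameter are all assumptions made for illustration.

```python
import math
import random


def masked_lm_loss(logits, targets, mask):
    """Mean cross-entropy over masked positions only.

    logits: list of per-residue score lists (one score per amino acid type)
    targets: list of true amino acid indices
    mask: list of booleans, True where the residue was masked out
    """
    total, count = 0.0, 0
    for row, target, is_masked in zip(logits, targets, mask):
        if not is_masked:
            continue
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]
        count += 1
    return total / count if count else 0.0


def add_noise(coords, sigma, rng):
    """Perturb 3D coordinates with Gaussian noise (assumed single noise level)."""
    noise = [[rng.gauss(0.0, 1.0) for _ in point] for point in coords]
    noisy = [
        [c + sigma * e for c, e in zip(point, eps)]
        for point, eps in zip(coords, noise)
    ]
    return noisy, noise


def denoising_loss(pred_noise, true_noise):
    """Mean squared error between predicted and injected noise."""
    sq = [
        (p - t) ** 2
        for pred_row, true_row in zip(pred_noise, true_noise)
        for p, t in zip(pred_row, true_row)
    ]
    return sum(sq) / len(sq)


def joint_loss(logits, targets, mask, pred_noise, true_noise, weight=1.0):
    """Hypothetical joint objective: sequence term plus weighted structure term."""
    return masked_lm_loss(logits, targets, mask) + weight * denoising_loss(
        pred_noise, true_noise
    )
```

In a real training loop, `logits` and `pred_noise` would both come from a single network that sees the masked sequence and the noised coordinates together, which is what lets each modality inform the reconstruction of the other.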

Authors

  • Jianwei Zhu; Yu Shi; Ran Bi; Peiran Jin; Chang Liu; Zhe Zhang; Haitao Huang; Zekun Guo; Pipi Hu; Fusong Ju; Lin Huang; Xinwei Tai; Chenao Li; Kaiyuan Gao; Xinran Wei; Huanhuan Xia; Jia Zhang; Yaosen Min; Zun Wang; Yusong Wang; Liang He; Haiguang Liu; Tao Qin