LoMuS: Low-Rank Adaptation with Multimodal Representations Improves Protein Stability Prediction

Journal: bioRxiv
Published Date:

Abstract

Protein folding stability is a key determinant of protein behavior, with implications for molecular function, pathogenicity, and protein engineering. Yet accurate prediction of protein stability changes remains challenging because of the high variability in the available data, especially when predicting from sequence alone because structural information is low-resolution or unavailable. In this work, we introduce LoMuS, a Multimodal deep learning model that combines two complementary views of the molecule and predicts the unnormalized protein Stability effect from the primary sequence alone. At the core of the architecture, a fusion network integrates explicit physicochemical descriptors with sequence embeddings derived from a Low-rank adapted protein language model. This combination generalizes accurately across a range of benchmark settings for predicting protein folding stability changes. We rigorously evaluated the model on tasks ranging from fold-induced stability changes to mutation-induced stability effects, including benchmarks on held-out protein domains, out-of-distribution label settings, and per-protein evaluation. LoMuS consistently outperforms other sequence-only protein stability baselines. It achieves an absolute gain of at least 10% in Spearman rank correlation across many held-out domains and out-of-distribution stability label settings. Per-protein validation shows further performance gains. Ablation analysis of the architectural choices confirms that the complementary signals from the derived features are critical to this multimodal approach. We believe LoMuS advances protein engineering research and can aid rational protein design by predicting protein stability changes more precisely.
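
The fusion idea described above can be illustrated with a minimal NumPy sketch. It is not the LoMuS implementation: the layer sizes, the mean pooling, the concatenation-based fusion, the linear head, and the names `lora_linear` and `fuse_and_predict` are all assumptions made for illustration, and random arrays stand in for language-model embeddings and physicochemical descriptors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
D_MODEL = 16   # width of the frozen projection
RANK = 2       # low-rank bottleneck r, with r << D_MODEL
N_DESC = 5     # number of physicochemical descriptors
SEQ_LEN = 10   # residues in the toy sequence


def lora_linear(x, w_frozen, a, b, alpha=4.0):
    """Frozen weight plus a scaled low-rank update: x @ W + (alpha / r) * (x @ A) @ B."""
    rank = a.shape[1]
    return x @ w_frozen + (alpha / rank) * (x @ a) @ b


def fuse_and_predict(token_emb, descriptors, w_frozen, a, b, w_head):
    """Mean-pool adapted embeddings, concatenate descriptors, apply a linear head."""
    adapted = lora_linear(token_emb, w_frozen, a, b)   # (SEQ_LEN, D_MODEL)
    pooled = adapted.mean(axis=0)                      # (D_MODEL,)
    fused = np.concatenate([pooled, descriptors])      # (D_MODEL + N_DESC,)
    return float(fused @ w_head)                       # scalar stability score


token_emb = rng.normal(size=(SEQ_LEN, D_MODEL))   # stand-in for per-residue embeddings
descriptors = rng.normal(size=N_DESC)             # stand-in for physicochemical features
w_frozen = rng.normal(size=(D_MODEL, D_MODEL))    # frozen during adaptation
a = rng.normal(size=(D_MODEL, RANK)) * 0.01       # trainable down-projection
b = np.zeros((RANK, D_MODEL))                     # trainable up-projection, zero at init
w_head = rng.normal(size=D_MODEL + N_DESC)        # regression head over the fused vector

score = fuse_and_predict(token_emb, descriptors, w_frozen, a, b, w_head)
```

In this sketch only `a`, `b`, and `w_head` would be trained. Because `b` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and the two low-rank factors hold far fewer parameters than the full weight matrix.
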
All code, including data preparation scripts, training and validation recipes, and experimental configurations for LoMuS, is available at: https://github.com/samuelinfantee/LoMuS-repository. Supplementary data are available at Journal Name online.
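
For readers unfamiliar with the metric, per-protein Spearman rank correlation can be computed as in the sketch below. This is a generic pure-Python illustration, not code from the LoMuS repository; the function names, the record layout `(protein_id, measured, predicted)`, and the toy values are assumptions.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def per_protein_spearman(records):
    """Group (protein_id, measured, predicted) records and score each protein separately."""
    groups = defaultdict(lambda: ([], []))
    for protein_id, measured, predicted in records:
        groups[protein_id][0].append(measured)
        groups[protein_id][1].append(predicted)
    return {pid: spearman(m, p) for pid, (m, p) in groups.items()}


# Toy records with hypothetical protein IDs and made-up values.
records = [
    ("protA", 1.0, 0.2), ("protA", 2.0, 0.5), ("protA", 3.0, 0.9),
    ("protB", 0.5, 0.9), ("protB", 1.5, 0.4), ("protB", 2.5, 0.1),
]
scores = per_protein_spearman(records)
```

Scoring each protein separately, rather than pooling all variants, prevents differences in baseline stability between proteins from inflating the correlation.
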

Authors

  • Samuel Infante; Akash Singh; Anowarul Kabir