PreMode predicts mode-of-action of missense variants by deep graph representation learning of protein sequence and structural context.

Journal: Nature Communications
Published Date:

Abstract

Accurate prediction of the functional impact of missense variants is important for disease gene discovery, clinical genetic diagnostics, therapeutic strategies, and protein engineering. Previous efforts have focused on predicting a binary pathogenicity classification, but the functional impact of missense variants is multi-dimensional. Pathogenic missense variants in the same gene may act through different modes of action (i.e., gain- or loss-of-function; G/LoF) by affecting different aspects of protein function. They may result in distinct clinical conditions that require different treatments. We develop a new method, PreMode, to perform gene-specific mode-of-action predictions. PreMode models effects of coding sequence variants using SE(3)-equivariant graph neural networks on protein sequences and structures. Using the largest-to-date set of missense variants with known modes of action, we show that PreMode reaches state-of-the-art performance in multiple types of mode-of-action predictions through efficient transfer learning. Additionally, PreMode's prediction of G/LoF variants in a kinase is consistent with inactive-active conformation transition energy changes. Finally, we show that PreMode enables efficient study design of deep mutational scans and can be expanded to fitness optimization of non-human proteins with active learning.

Authors

  • Guojie Zhong
    Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA.
  • Yige Zhao
    Department of Systems Biology, Columbia University, New York, NY, USA.
  • Demi Zhuang
    Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA.
  • Wendy K Chung
    Departments of Pediatrics and Medicine, Columbia University Medical Center, New York, NY, USA.
  • Yufeng Shen
    Department of Systems Biology, Columbia University, New York, NY, USA. ys2411@cumc.columbia.edu.