Predicting epistasis across proteins by structural logic.
Journal:
Proceedings of the National Academy of Sciences of the United States of America
Published Date:
Jan 16, 2026
Abstract
Accurately predicting the phenotypic consequences of genetic variation is a major challenge for precision medicine. The problem is exacerbated by epistatic interactions, nonadditive effects between genetic variants that produce unexpected phenotypes. Here, we explore an understudied form of positive epistasis: intragenic complementation, in which pairs of loss-of-function variants restore near wild-type protein function. Using mutational scanning in yeast, we identify thousands of such interactions in a clinically important enzyme, human argininosuccinate lyase (ASL). Restoration of protein function is not due to the biochemical properties of the substituted amino acids, but rather to a structural feature of the protein, the active site assembly. We develop a machine learning algorithm that uses protein language model embeddings to predict intragenic complementation in ASL with 99.6% accuracy. Additionally, the model trained on ASL generalizes to a structurally related but sequence-divergent enzyme, fumarase, with accuracy over 90%. Our findings reveal a structural basis for this form of epistasis and provide a predictive framework that could extend to at least 4% of human proteins.
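As an illustration of the terms used above, the following minimal sketch scores a variant pair for epistasis as the deviation from an additive expectation, and flags intragenic complementation when two loss-of-function variants together restore near wild-type function. It is not the authors' pipeline or their machine learning model. The normalized function scale (wild type = 1.0), the additive null model floored at zero, the thresholds `lof_max` and `restored_min`, and all names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PairMeasurement:
    """Hypothetical normalized function scores (wild type = 1.0, null = 0.0)."""
    f_a: float   # variant A alone
    f_b: float   # variant B alone
    f_ab: float  # variants A and B combined


def epistasis(m: PairMeasurement, wt: float = 1.0) -> float:
    """Observed pair score minus the additive expectation.

    Assumption: the additive expectation sums each variant's deviation
    from wild type and is floored at 0, since function cannot be negative.
    A positive result indicates positive epistasis.
    """
    expected = max(0.0, m.f_a + m.f_b - wt)
    return m.f_ab - expected


def is_complementing(m: PairMeasurement,
                     lof_max: float = 0.2,
                     restored_min: float = 0.8) -> bool:
    """True when both single variants are loss-of-function and the pair
    is near wild type. Both thresholds are illustrative assumptions."""
    both_lof = m.f_a <= lof_max and m.f_b <= lof_max
    return both_lof and m.f_ab >= restored_min


# Made-up values for illustration only; not data from the study.
complementing_pair = PairMeasurement(f_a=0.10, f_b=0.05, f_ab=0.90)
non_complementing_pair = PairMeasurement(f_a=0.10, f_b=0.05, f_ab=0.10)
```

Under these assumptions, `complementing_pair` is flagged because each variant alone falls below `lof_max` while the combination exceeds `restored_min`; `non_complementing_pair` is not, because the combination remains nonfunctional.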