Geometric deep learning of protein-DNA binding specificity.

Journal: Nature Methods
Published Date:

Abstract

Predicting protein-DNA binding specificity is a challenging yet essential task for understanding gene regulation. Protein-DNA complexes usually exhibit binding to a selected DNA target site, whereas a protein binds, with varying degrees of binding specificity, to a wide range of DNA sequences. This information is not directly accessible in a single structure. Here, to access this information, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity from protein-DNA structure. DeepPBS can be applied to experimental or predicted structures. Interpretable protein heavy atom importance scores for interface residues can be extracted. When aggregated at the protein residue level, these scores are validated through mutagenesis experiments. Applied to designed proteins targeting specific DNA sequences, DeepPBS was demonstrated to predict experimentally measured binding specificity. DeepPBS offers a foundation for machine-aided studies that advance our understanding of molecular interactions and guide experimental designs and synthetic biology.

Authors

  • Raktim Mitra
    Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
  • Jinsen Li
    Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA.
  • Jared M Sagendorf
    Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
  • Yibei Jiang
    Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
  • Ari S Cohen
    Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
  • Tsu-Pei Chiu
    Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA.
  • Cameron J Glasscock
    Department of Biochemistry, University of Washington, Seattle, WA, USA.
  • Remo Rohs
    Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA.