Prediction of protein-ligand binding affinity from sequencing data with interpretable machine learning.

Journal: Nature Biotechnology
Published Date:

Abstract

Protein-ligand interactions are increasingly profiled at high throughput using affinity selection and massively parallel sequencing. However, these assays do not provide the biophysical parameters that most rigorously quantify molecular interactions. Here we describe a flexible machine learning method, called ProBound, that accurately defines sequence recognition in terms of equilibrium binding constants or kinetic rates. This is achieved using a multi-layered maximum-likelihood framework that models both the molecular interactions and the data generation process. We show that ProBound quantifies transcription factor (TF) behavior with models that predict binding affinity over a range exceeding that of previous resources; captures the impact of DNA modifications and conformational flexibility of multi-TF complexes; and infers specificity directly from in vivo data such as ChIP-seq without peak calling. When coupled with an assay called KD-seq, it determines the absolute affinity of protein-ligand interactions. We also apply ProBound to profile the kinetics of kinase-substrate interactions. ProBound opens new avenues for decoding biological networks and rationally engineering protein-ligand interactions.
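
To make the idea of a likelihood that covers both the binding model and the data generation process more concrete, the following is a minimal sketch and not ProBound's actual implementation or API. It assumes an additive free-energy model (one penalty per position and base, in units of RT), forward-strand binding only, and a low-saturation selection step in which the expected frequency of a sequence after selection is proportional to its input frequency times its relative affinity. All names (`relative_affinity`, `sequence_affinity`, `selection_log_likelihood`, `ddG`) are hypothetical and chosen for illustration.

```python
import math

def relative_affinity(site, ddG):
    """Relative affinity of one binding site under an additive energy model.

    ddG is a list with one dict per motif position, mapping each base to an
    energy penalty in units of RT (0 for the preferred base). Assumption:
    positions contribute independently.
    """
    total = sum(ddG[i][base] for i, base in enumerate(site))
    return math.exp(-total)

def sequence_affinity(seq, ddG):
    """Sum the relative affinity over every forward-strand window of seq."""
    width = len(ddG)
    return sum(
        relative_affinity(seq[i:i + width], ddG)
        for i in range(len(seq) - width + 1)
    )

def selection_log_likelihood(counts_in, counts_out, ddG):
    """Multinomial log-likelihood of post-selection read counts.

    Assumption: expected post-selection frequency is proportional to
    input count times affinity. The multinomial coefficient is omitted
    because it does not depend on the model parameters.
    """
    weights = {s: n * sequence_affinity(s, ddG) for s, n in counts_in.items()}
    z = sum(weights.values())
    return sum(
        n * math.log(weights[s] / z) for s, n in counts_out.items() if n > 0
    )

# Toy two-position motif preferring "AC"; every mismatch costs 1 RT.
ddG = [
    {"A": 0.0, "C": 1.0, "G": 1.0, "T": 1.0},
    {"A": 1.0, "C": 0.0, "G": 1.0, "T": 1.0},
]
flat = [{b: 0.0 for b in "ACGT"} for _ in range(2)]

counts_in = {"AC": 100, "GT": 100}   # made-up input library counts
counts_out = {"AC": 90, "GT": 10}    # made-up counts after selection

# A model that prefers "AC" explains the enrichment better than a flat one.
print(selection_log_likelihood(counts_in, counts_out, ddG) >
      selection_log_likelihood(counts_in, counts_out, flat))
```

In a maximum-likelihood fit, the entries of `ddG` would be adjusted to maximize this log-likelihood; the sketch only evaluates it for two fixed parameter sets.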

Authors

  • H Tomas Rube
    Department of Bioengineering, University of California, Merced, Merced, CA, USA.
  • Chaitanya Rastogi
    Department of Biological Sciences, Columbia University, New York, NY, USA.
  • Siqian Feng
    Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA.
  • Judith F Kribelbauer
    Department of Biological Sciences, Columbia University, New York, NY, USA.
  • Allyson Li
    Department of Chemistry, Columbia University, New York, NY, USA.
  • Basheer Becerra
    Department of Biological Sciences, Columbia University, New York, NY, USA.
  • Lucas A N Melo
    Department of Biological Sciences, Columbia University, New York, NY, USA.
  • Bach Viet Do
    Department of Biological Sciences, Columbia University, New York, NY, USA.
  • Xiaoting Li
    Department of Medicine, Leshan Vocational and Technical College, No. 1336, Middle Section of Qingyijiang Avenue, Shizhong District, Leshan City, Sichuan Province, China.
  • Hammaad H Adam
    Department of Biological Sciences, Columbia University, New York, NY, USA.
  • Neel H Shah
    Department of Chemistry, Columbia University, New York, NY, USA.
  • Richard S Mann
    Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA.
  • Harmen J Bussemaker
    Department of Biological Sciences, Columbia University, New York, NY, USA. hjb2004@columbia.edu.