Deep learning to decompose macromolecules into independent Markovian domains.

Journal: Nature Communications
Published Date:

Abstract

The increasing interest in modeling the dynamics of ever larger proteins has revealed a fundamental problem with models that describe the molecular system as being in a global configuration state. This notion limits our ability to gather sufficient statistics of state probabilities or state-to-state transitions because for large molecular systems the number of metastable states grows exponentially with size. In this manuscript, we approach this challenge by introducing a method that combines our recent progress on independent Markov decomposition (IMD) with VAMPnets, a deep learning approach to Markov modeling. We establish a training objective that quantifies how well a given decomposition of the molecular system into independent subdomains with Markovian dynamics approximates the overall dynamics. By constructing an end-to-end learning framework, the decomposition into such subdomains and their individual Markov state models are simultaneously learned, providing a data-efficient and easily interpretable summary of the complex system dynamics. While learning the dynamical coupling between Markovian subdomains is still an open issue, the present results are a significant step towards learning Ising models of large molecular complexes from simulation data.
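The scaling argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the transition matrices `T1` and `T2` and the helpers `kron` and `n_global_states` are hypothetical names and values chosen for illustration. The sketch assumes two fully independent subdomains, in which case the global transition matrix is the Kronecker product of the subdomain matrices, and the number of global states is the product of the per-domain state counts.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    na, nb = len(a), len(b)
    return [
        [a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
        for i in range(na)
        for k in range(nb)
    ]


def n_global_states(n_domains, states_per_domain=2):
    """Number of joint states when every domain has the same number of states."""
    return states_per_domain ** n_domains


# Hypothetical 2-state transition matrices for two independent subdomains
# (illustrative values, not taken from the paper).
T1 = [[0.9, 0.1],
      [0.2, 0.8]]
T2 = [[0.7, 0.3],
      [0.4, 0.6]]

# Under the independence assumption, the 4-state global transition matrix
# factorizes exactly: joint state (i, k) maps to row index i * 2 + k.
T_global = kron(T1, T2)

# Each row of the global matrix is still a probability distribution.
row_sums = [sum(row) for row in T_global]

if __name__ == "__main__":
    for row in T_global:
        print([round(p, 3) for p in row])
    # Global state count grows exponentially with the number of domains,
    # while the independent description needs only one small matrix per domain.
    for n in (2, 10, 20):
        print(n, "domains:", n_global_states(n), "global states")
```

In this idealized case, two 2x2 matrices carry the same information as the 4x4 global matrix, which is the data-efficiency argument made above. Real subdomains are only approximately independent, which is why a training objective that scores the quality of a decomposition is needed.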

Authors

  • Andreas Mardt
    Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195, Berlin, Germany.
  • Tim Hempel
    Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.
  • Cecilia Clementi
    Center for Theoretical Biological Physics, and Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, United States. Electronic address: cecilia@rice.edu.
  • Frank Noé
    Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.