Machine learning in biological physics: From biomolecular prediction to design.

Journal: Proceedings of the National Academy of Sciences of the United States of America
PMID:

Abstract

Machine learning has been proposed as an alternative to theoretical modeling when dealing with complex problems in biological physics. However, in this perspective, we argue that a more successful approach is a proper combination of these two methodologies. We discuss how ideas coming from the physical modeling of neuronal processing led to early formulations of computational neural networks, e.g., Hopfield networks. We then show how modern learning approaches like Potts models, Boltzmann machines, and the transformer architecture are related to each other, specifically, through a shared energy representation. We summarize recent efforts to establish these connections and provide examples of how each of these formulations integrating physical modeling and machine learning has been successful in tackling recent problems in biomolecular structure, dynamics, function, evolution, and design. Instances include protein structure prediction; improvements in the computational complexity and accuracy of molecular dynamics simulations; better inference of the effects of mutations in proteins, leading to improved evolutionary modeling; and, finally, how machine learning is revolutionizing protein engineering and design. Going beyond naturally existing protein sequences, we discuss a connection to protein design in which synthetic sequences are able to fold to naturally occurring motifs, driven by a model rooted in physical principles. We show that this model is "learnable" and propose its future use in the generation of unique sequences that can fold into a target structure.
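
The shared energy representation mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the Hebbian learning rule, and the toy eight-spin pattern are assumptions chosen for illustration. The sketch writes down a Hopfield network energy and a Potts energy, and shows that a two-state Potts model whose couplings are built from the Hopfield weights assigns the same energy to a configuration.

```python
import numpy as np


def hopfield_weights(patterns):
    """Hebbian weights for +/-1 patterns (one per row), with zero diagonal."""
    patterns = np.asarray(patterns, dtype=float)
    n = patterns.shape[1]
    w = patterns.T @ patterns / n
    np.fill_diagonal(w, 0.0)
    return w


def hopfield_energy(w, s):
    """E(s) = -1/2 * sum_{i != j} w_ij s_i s_j (no bias terms)."""
    return -0.5 * s @ w @ s


def recall(w, s, sweeps=5):
    """Asynchronous updates; each flip cannot raise the Hopfield energy."""
    s = np.array(s, dtype=float)
    for _ in range(sweeps):
        for i in range(len(s)):
            s[i] = 1.0 if w[i] @ s >= 0 else -1.0
    return s


def potts_energy(seq, h, J):
    """H(seq) = -sum_i h[i, a_i] - sum_{i<j} J[i, j, a_i, a_j]."""
    L = len(seq)
    e = -sum(h[i, seq[i]] for i in range(L))
    e -= sum(J[i, j, seq[i], seq[j]] for i in range(L) for j in range(i + 1, L))
    return e


def hopfield_as_potts(w):
    """Illustrative mapping (an assumption of this sketch): states 0/1 stand
    for spins -1/+1, so J[i, j, a, b] = w_ij * spin(a) * spin(b), fields zero."""
    spins = np.array([-1.0, 1.0])
    J = w[:, :, None, None] * spins[None, None, :, None] * spins[None, None, None, :]
    h = np.zeros((w.shape[0], 2))
    return h, J


# Toy usage: store one pattern, corrupt one spin, then recall it.
pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
w = hopfield_weights([pattern])
noisy = pattern.copy()
noisy[2] *= -1
recovered = recall(w, noisy)

# The same configuration scored by the equivalent two-state Potts model.
h, J = hopfield_as_potts(w)
seq = ((noisy + 1) // 2).astype(int)
print(hopfield_energy(w, noisy), potts_energy(seq, h, J))
print(hopfield_energy(w, recovered))
```

In sequence-based applications the Potts states would index amino acids rather than two spin values, and the fields and couplings would be inferred from data rather than set by a Hebbian rule; the two-state case is used here only because it makes the correspondence with the Hopfield energy exact.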

Authors

  • Jonathan Martin
    Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080.
  • Marcos Lequerica Mateos
    BCMaterials, Basque Center for Materials, Applications and Nanostructures, Universidad del País Vasco/Euskal Herriko Unibertsitatea Science Park, Leioa 48940, Spain.
  • José N Onuchic
    Center for Theoretical Biological Physics, Rice University, Houston, TX 77005; michele.dipierro@rice.edu jonuchic@rice.edu.
  • Ivan Coluzza
    BCMaterials, Basque Center for Materials, Applications and Nanostructures, Universidad del País Vasco/Euskal Herriko Unibertsitatea Science Park, Leioa 48940, Spain.
  • Faruck Morcos
    Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080.