MLCV: Bridging Machine-Learning-Based Dimensionality Reduction and Free-Energy Calculation.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

Importance-sampling algorithms that rely on the definition of a model reaction coordinate (RC) are widely employed to probe processes relevant to chemistry and biology alike, spanning time scales not amenable to common, brute-force molecular dynamics (MD) simulations. In practice, the model RC often consists of a handful of collective variables (CVs) chosen on the basis of chemical intuition. However, manually constructing a low-dimensional RC model to describe an intricate geometrical transformation for the purpose of free-energy calculations and analyses remains a daunting challenge due to the inherent complexity of the conformational transitions at play. To meet this challenge, remarkable progress has been made in employing machine-learning techniques, such as autoencoders, to extract the low-dimensional RC model from a large set of CVs. Implementing the differentiable, nonlinear machine-learned CVs in common MD engines to perform free-energy calculations is, however, particularly cumbersome. To address this issue, we present here a user-friendly tool (called MLCV) that facilitates the use of machine-learned CVs in importance-sampling simulations through the popular Colvars module. Our approach is critically probed with three case examples consisting of small peptides, showcasing that, through a hard-coded neural network in Colvars, deep learning and enhanced sampling can be effectively bridged with MD simulations. The MLCV code is versatile, applicable to all the CVs available in Colvars, and can be connected to any kind of dense neural network. We believe that MLCV provides an effective, powerful, and user-friendly platform accessible to experts and nonexperts alike for machine-learning (ML)-guided CV discovery and enhanced-sampling simulations to unveil the molecular mechanisms underlying complex biochemical processes.
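
To illustrate what a differentiable, machine-learned CV built from a dense neural network involves, the following is a minimal sketch in plain Python. It is not the MLCV or Colvars implementation: the function name `dense_forward`, the layer layout, the weights, and the input values are all hypothetical and chosen only for illustration. The sketch evaluates a small dense network on a vector of input CVs and propagates the Jacobian with the chain rule, since biasing forces in importance sampling require the gradient of the learned CV with respect to its inputs.

```python
import math

def dense_forward(x, layers):
    """Evaluate a dense network and its Jacobian with respect to the inputs.

    x      : list of input CV values
    layers : list of (W, b, activation) tuples; W is a list of rows,
             activation is "tanh" or "linear"
    Returns (y, J) with J[i][j] = d y_i / d x_j.
    """
    n = len(x)
    a = list(x)
    # Jacobian of the current activations w.r.t. the inputs (identity at start).
    J = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for W, b, act in layers:
        # Affine part of the layer: z = W a + b
        z = [sum(w * v for w, v in zip(row, a)) + bi for row, bi in zip(W, b)]
        # Chain rule for the affine part: dz/dx = W (da/dx)
        Jz = [[sum(row[k] * J[k][j] for k in range(len(a))) for j in range(n)]
              for row in W]
        if act == "tanh":
            a = [math.tanh(v) for v in z]
            # d tanh(z)/dz = 1 - tanh(z)^2, applied row by row
            J = [[(1.0 - ai * ai) * g for g in Jrow] for ai, Jrow in zip(a, Jz)]
        elif act == "linear":
            a, J = z, Jz
        else:
            raise ValueError("unsupported activation: " + act)
    return a, J

# Hypothetical encoder: 2 input CVs -> 2 hidden units (tanh) -> 1 learned CV.
layers = [
    ([[0.5, -0.3], [0.8, 0.1]], [0.0, 0.1], "tanh"),
    ([[1.0, -1.0]], [0.0], "linear"),
]
cvs = [-1.2, 2.4]  # hypothetical input CV values
xi, grad = dense_forward(cvs, layers)
print("learned CV:", xi[0])
print("gradient w.r.t. input CVs:", grad[0])
```

In a real workflow, the weights would come from a trained model such as the encoder half of an autoencoder, and the gradient with respect to the input CVs would be chained further with the gradients of those CVs with respect to atomic coordinates.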

Authors

  • Haochuan Chen
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Han Liu
    Shenzhen Key Laboratory of Photonic Devices and Sensing Systems for Internet of Things, Guangdong and Hong Kong Joint Research Centre for Optical Fibre Sensors, State Key Laboratory of Radio Frequency Heterogeneous Integration, Shenzhen University, Shenzhen 518060, China.
  • Heying Feng
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Haohao Fu
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Wensheng Cai
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Xueguang Shao
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Christophe Chipot
    Laboratoire International Associé CNRS and University of Illinois at Urbana-Champaign, UMR no. 7019, Université de Lorraine, BP 70239, F-54506 Vandœuvre-lès-Nancy, France.