Machine Learning Deciphered Molecular Mechanistics with Accurate Kinetic and Thermodynamic Prediction.

Journal: Journal of chemical theory and computation
PMID:

Abstract

Time-lagged independent component analysis (tICA) and the Markov state model (MSM) have been extensively employed for extracting conformational dynamics and kinetic community networks from unbiased trajectory ensembles. However, these techniques may not be the optimal choice for elucidating transition mechanisms within low-dimensional representations, especially for intricate biosystems. Unraveling the association mechanism in such complex systems always necessitates combining several essential independent components or collective variables, a process that is inherently obscure and may require empirical knowledge for selection. To address these challenges, we have implemented an integrated unsupervised dimension reduction model: uniform manifold approximation and projection (UMAP) with hierarchical density-based spatial clustering of applications with noise (HDBSCAN). This approach effectively generates low-dimensional configurational embeddings. The hierarchical application of this architecture, in conjunction with MSM, reveals global kinetic connectivity while identifying local conformational states. Consequently, our methodology establishes a multiscale mechanistic elucidation framework. Leveraging the benefits of the uniform sample distribution and a denoising approach, our model is more robust in preserving global and local data structures than the traditional dimension reduction methods used in molecular dynamics (MD) analysis. The interpretability of hyperparameter selection and compatibility with downstream tasks are cross-validated across various simulation data sets, utilizing both computational evaluation metrics and experimental kinetic observables. Furthermore, the predicted Mcl1-BH3 association kinetics (0.76 s) is in close agreement with surface plasmon resonance experiments (0.12 s), affirming the plausibility of the identified pathway composed of representative conformations.
We anticipate that the devised workflow will serve as a foundational framework for studying recognition patterns in complex biological systems. Its contributions extend to the exploration of protein functional dynamics and rational drug design, offering a potent avenue for advancing research in these domains.
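
As an illustration of the MSM step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that per-frame cluster labels are already available (for example from HDBSCAN applied to a UMAP embedding, with -1 marking noise frames, which is the convention of common HDBSCAN implementations). The function names, the toy trajectory, and the naive count symmetrization are all assumptions made for this example; production MSM software typically uses more careful reversible estimators.

```python
import numpy as np


def estimate_msm(labels, lag=1, noise_label=-1):
    """Estimate a row-stochastic transition matrix from one discrete trajectory.

    Assumes every non-noise state takes part in at least one counted transition.
    """
    labels = np.asarray(labels)
    states = np.unique(labels[labels != noise_label])
    index = {int(s): i for i, s in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for a, b in zip(labels[:-lag], labels[lag:]):
        if a == noise_label or b == noise_label:
            continue  # ignore transitions that touch a noise frame
        counts[index[int(a)], index[int(b)]] += 1
    # Naive symmetrization (an assumption of this sketch) to enforce detailed balance.
    counts = counts + counts.T
    return states, counts / counts.sum(axis=1, keepdims=True)


def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(T.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()


def implied_timescales(T, lag=1):
    """Implied timescales t_i = -lag / ln(lambda_i) for positive non-unit eigenvalues."""
    eigvals = np.sort(np.real(np.linalg.eigvals(T)))[::-1]
    slow = eigvals[1:]
    slow = slow[slow > 0]  # the logarithm is only defined for positive eigenvalues
    return -lag / np.log(slow)


# Hypothetical toy trajectory of cluster labels; -1 marks a noise frame.
toy_labels = [0, 0, 0, 1, 1, -1, 1, 0, 0, 1, 1, 1]
states, T = estimate_msm(toy_labels, lag=1)
pi = stationary_distribution(T)
ts = implied_timescales(T, lag=1)
print("states:", states)
print("transition matrix:\n", T)
print("stationary distribution:", pi)
print("implied timescales (in frames):", ts)
```

In this sketch the stationary distribution plays the role of the thermodynamic estimate (state populations) and the implied timescales that of the kinetic estimate; both are expressed in units of trajectory frames and would need the simulation time step to be converted into physical time.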

Authors

  • Junlin Dong
    University of Chinese Academy of Sciences, Beijing 100049, China.
  • Shiyu Wang
    Research Center for Computer-Aided Drug Discovery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
  • Wenqiang Cui
    Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
  • Xiaolin Sun
    Department of Transfusion Medicine, The First Medical Center of Chinese PLA General Hospital, Beijing, China.
  • Haojie Guo
    Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
  • Hailu Yan
    School of Biological Sciences, College of Science and Engineering, University of Edinburgh, Edinburgh EH8 9YL, U.K.
  • Horst Vogel
    AlphaMol Science Ltd, CH-4123 Allschwil, Switzerland; Institute of Chemical Science and Engineering (ISIC), Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland.
  • Zhi Wang
    Department of Pharmacy, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China.
  • Shuguang Yuan
    Research Center for Computer-Aided Drug Discovery, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; AlphaMol Science Ltd, CH-4123 Allschwil, Switzerland; Institute of Chemical Science and Engineering (ISIC), Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland. Electronic address: shuguang.yuan@gmail.com.