MS2MP: A Deep Learning Framework for Metabolic Pathway Prediction from MS/MS-Based Untargeted Metabolomics.

Journal: Analytical Chemistry
Published Date:

Abstract

MS/MS-based untargeted metabolomics generates complex data, but pathway enrichment analysis is constrained by the low annotation rates of metabolic features. Here, we propose MS2MP, a novel deep learning-based framework that predicts KEGG pathways directly from untargeted tandem mass spectrometry (MS/MS) data, eliminating the need for prior metabolite annotation. MS2MP represents MS/MS spectra as fragmentation tree graphs and uses a graph neural network architecture to learn the complex relationships between spectral features and metabolic pathways. Trained on 33,221 experimental MS/MS spectra, MS2MP achieves a balanced accuracy of 94.1% in cross-validation and 87.8% to 91.2% on three independent test sets. Notably, MS2MP achieves an "exact match" for 97 to 98 of 161 tested metabolite standards across diverse experimental conditions, underscoring its reliability and adaptability. Subsequently, a novel MS/MS-based pathway enrichment method was developed, and the established methods were applied to identify significantly perturbed pathways in transgenic maize. The results uncovered disruptions in phenylpropanoid biosynthesis and related downstream pathways, including those involved in amino acid and secondary metabolite metabolism, which were overlooked by the conventional annotation-based enrichment analysis method. To the best of our knowledge, MS2MP is the first computational tool capable of directly predicting metabolic pathways from MS/MS spectra. By linking MS/MS-based untargeted metabolomics data to metabolic pathways, MS2MP enables more efficient pathway enrichment analysis, thereby accelerating biological discoveries and enhancing our understanding of complex metabolic networks.
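
For readers unfamiliar with the balanced accuracy metric reported above, the following is a minimal sketch of its common binary definition: the mean of sensitivity and specificity for a single pathway label. The function name, the toy labels, and the choice of the binary form are illustrative assumptions; the abstract does not state how the authors aggregate the metric across pathways.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Hypothetical helper for illustration: 1 means "spectrum belongs to
    the pathway", 0 means it does not.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against classes that are absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Toy, made-up labels: two positives and six negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy: {score:.3f}, plain accuracy: {plain_accuracy:.3f}")
```

Because most spectra do not belong to any given pathway, plain accuracy can look high even when few true members are recovered; averaging the two per-class rates weights the rare positive class equally with the negative class.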

Authors

  • Han Bao
    Department of Computer Science, The University of Tokyo, Japan; Center for Advanced Intelligence Project, RIKEN, Japan. Electronic address: tsutsumi@ms.k.u-tokyo.ac.jp.
  • Xiuqiong Zhang
    State Key Laboratory of Medical Proteomics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, P. R. China.
  • Xinxin Wang
    School of Science, Tianjin University of Commerce, Tianjin 300134, China.
  • Jinhui Zhao
    Centre for Addictions Research of British Columbia, University of Victoria, Victoria, Canada.
  • Xinjie Zhao
    College of Humanities and Development Studies, China Agricultural University, Beijing 100083, PR China. Electronic address: sinketsuzao@foxmail.com.
  • Chunxia Zhao
    State Key Laboratory of Medical Proteomics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, P. R. China.
  • Xin Lu
    CAS Key Laboratory of Separation Sciences for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China.
  • Guowang Xu
    CAS Key Laboratory of Separation Sciences for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China.