ABCoRT: Retention Time Prediction for Metabolite Identification via Atom-Bond Co-Learning.

Journal: Journal of chemical information and modeling
PMID:

Abstract

Liquid chromatography retention time (RT) prediction plays a crucial role in metabolite identification, a challenging and essential task in untargeted metabolomics. Accurate molecular representation is vital for reliable RT prediction. To address this, we propose a novel molecular representation learning framework, ABCoRT (Atom-Bond Co-learning for Retention Time prediction), designed for predicting metabolite retention times. Our model transforms molecular graphs into dual hypergraphs, enabling the collaborative updating of atomic and bond information within both molecular graphs and hypergraphs, thereby producing highly informative molecular representations. We evaluated ABCoRT on the large-scale Small Molecule Retention Time (SMRT) data set comprising 80,038 molecules. After removing nonretained molecules, our model achieved a mean absolute error (MAE) of 25.75 s and a mean relative error (MRE) of 3.24%. We also fine-tuned pretrained ABCoRT models on six additional data sets from PredRet, achieving the lowest MAEs on five of them. Finally, in metabolite screening conducted on the MetaboBASE and RIKEN_PlaSM data sets from the MassBank of North America, ABCoRT filtered out 38.35% and 28.46% of candidate compounds, respectively.
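
For readers unfamiliar with the two error metrics reported above, the following is a minimal sketch of how MAE and MRE are commonly computed for retention times. It assumes MRE means the average of per-molecule absolute error divided by the true RT; the abstract does not spell out the exact definition, and the function names and sample values are hypothetical, not taken from the paper.

```python
def mean_absolute_error(true_rt, pred_rt):
    """Average absolute difference, in the same unit as the inputs (seconds here)."""
    if len(true_rt) != len(pred_rt) or not true_rt:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(true_rt, pred_rt)) / len(true_rt)


def mean_relative_error(true_rt, pred_rt):
    """Average of |pred - true| / true, as a fraction (assumed definition of MRE)."""
    if len(true_rt) != len(pred_rt) or not true_rt:
        raise ValueError("inputs must be non-empty and of equal length")
    # True RTs must be positive for the ratio to be meaningful.
    return sum(abs(p - t) / t for t, p in zip(true_rt, pred_rt)) / len(true_rt)


# Hypothetical retention times in seconds (illustrative only, not paper data).
true_rt = [400.0, 500.0, 800.0]
pred_rt = [420.0, 490.0, 800.0]

mae = mean_absolute_error(true_rt, pred_rt)
mre = mean_relative_error(true_rt, pred_rt)
print(f"MAE = {mae:.2f} s, MRE = {100 * mre:.2f}%")
```

Because MRE divides by the true RT, molecules with very short retention times can dominate it, which is one plausible reason such metrics are reported after removing nonretained molecules.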

Authors

  • Guangbin Cheng
    School of Information Science and Engineering, Yunnan University, Kunming 650091, China.
  • Bingyi Wang
    School of Basic Medical Sciences, Lanzhou University, Lanzhou, China.
  • Nannan Bai
    Yunnan Police College, Kunming 650223, China.
  • Weihua Li
    State Key Laboratory of Molecular Engineering of Polymers, Key Laboratory of Computational Physical Sciences, Department of Macromolecular Science, Fudan University, Shanghai 200438, China.