Deep learning-based method for automatic resolution of gas chromatography-mass spectrometry data from complex samples.

Journal: Journal of chromatography. A
PMID:

Abstract

Modern gas chromatography-mass spectrometry (GC-MS) is the workhorse for high-throughput profiling of volatile compounds in complex samples. It produces large amounts of two-dimensional data, and automatic methods are required to extract chemical information from raw GC-MS data efficiently. In this study, we propose an Automatic Resolution method (AutoRes) based on pseudo-Siamese convolutional neural networks (pSCNN) to extract meaningful features obscured by noise, baseline drift, retention time shifts, and overlapped peaks. Two pSCNN models were each trained with 400,000 augmented spectral pairs. They predict the selective region (pSCNN1) and the elution region (pSCNN2) of compounds in an untargeted manner. The accuracies of pSCNN1 and pSCNN2 on their test sets are 99.9% and 92.6%, respectively. The chromatographic profile of each component is then resolved automatically by full rank resolution (FRR) using the regions predicted by these models. The performance of AutoRes was evaluated on simulated and plant essential oil datasets. Compared to AMDIS and MZmine, AutoRes resolves more reasonable mass spectra, chromatograms, and peak areas for identifying and quantifying compounds. When mass spectra resolved from overlapped peaks in Set Ⅰ and Set Ⅱ of the plant essential oil dataset were matched against the NIST17 library, the average match scores of AutoRes (925 and 936) exceeded those of AMDIS (909 and 925) and MZmine (888 and 916). AutoRes extracted peak areas and mass spectra automatically from 10 GC-MS files of plant essential oils, completing the entire process in 8 min without any prior information or manual intervention. It is implemented in Python and is available as an open-source package at https://github.com/dyjfan/AutoRes.
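
The match scores quoted in the abstract compare a resolved mass spectrum against a library spectrum on a 0 to 999 scale. As a rough illustration of what such a score measures, the sketch below computes a plain cosine similarity between two spectra and scales it to 999. This is an assumption made for illustration only: it is not the scoring used by the NIST library search or by AutoRes (library search tools commonly apply additional weighting), and the function name `match_score` and the dict-based spectrum format are hypothetical.

```python
import math


def match_score(spec_a, spec_b, scale=999):
    """Simplified spectral match score (illustrative, not the NIST algorithm).

    Each spectrum is a dict mapping an integer m/z value to an intensity.
    Returns the cosine similarity of the two spectra, scaled to 0..scale.
    """
    # Only m/z values present in both spectra contribute to the dot product.
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)

    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0  # an empty or all-zero spectrum matches nothing

    # Clamp to 1.0 to guard against floating-point overshoot.
    cosine = min(1.0, dot / (norm_a * norm_b))
    return round(scale * cosine)


# Hypothetical toy spectra (m/z -> intensity), not data from the paper.
resolved = {41: 30.0, 69: 55.0, 93: 100.0, 136: 20.0}
library = {41: 28.0, 69: 60.0, 93: 100.0, 136: 18.0}
score = match_score(resolved, library)
```

A score near 999 means the two spectra have nearly the same relative intensity pattern; because cosine similarity ignores overall scale, multiplying every intensity in one spectrum by a constant leaves the score unchanged.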

Authors

  • Yingjie Fan
    College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China.
  • Chuanxiu Yu
    College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, Hunan, China.
  • Hongmei Lu
    College of Chemistry and Chemical Engineering, Central South University, Changsha, People's Republic of China.
  • Yi Chen
    Department of Anesthesiology and Perioperative Medicine, General Hospital of Ningxia Medical University, Yinchuan, China.
  • Binbin Hu
    Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, Yunnan, China.
  • Xingren Zhang
    Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, Yunnan, China; Baoshan City Branch of Yunnan Tobacco Company, Baoshan 678000, Yunnan, China.
  • Jiaen Su
    Dali Prefecture Branch of Yunnan Tobacco Company, Dali 671000, Yunnan, China. Electronic address: wwdzxcl@126.com.
  • Zhimin Zhang
    School of Control Science and Engineering, Shandong University, Jinan, People's Republic of China. School of Information Technology and Electrical Engineering, University of Queensland, Queensland, Australia.