NormAE: Deep Adversarial Learning Model to Remove Batch Effects in Liquid Chromatography Mass Spectrometry-Based Metabolomics Data.

Journal: Analytical chemistry
Published Date:

Abstract

Untargeted metabolomics based on liquid chromatography-mass spectrometry is affected by nonlinear batch effects, which mask biological effects, lead to nonreproducible results, and are difficult to calibrate. In this study, we propose a novel deep learning model, called Normalization Autoencoder (NormAE), which is based on nonlinear autoencoders (AEs) and adversarial learning. An additional classifier and ranker are trained to provide adversarial regularization during the training of the AE model. Latent representations are extracted by the encoder, and the decoder then reconstructs the data without batch effects. The NormAE method was tested on two real metabolomics data sets. After calibration by NormAE, the quality control samples (QCs) for both data sets clustered most tightly in a PCA score plot (average distances decreased from 56.550 and 52.476 to 7.383 and 14.075, respectively) and obtained the highest average correlation coefficients (from 0.873 and 0.907 to 0.997 for both). Additionally, NormAE significantly improved biomarker discovery (median number of differential peaks increased from 322 and 466 to 1140 and 1622, respectively). NormAE was compared with four commonly used batch effect removal methods. The results demonstrated that NormAE produced the best calibration results.
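
The two QC-based criteria reported above (how tightly QCs cluster in a score plot, and how strongly QC profiles correlate) can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that "average distance" means the mean pairwise Euclidean distance between QC coordinates in the score plot and that "average correlation" means the mean pairwise Pearson correlation between QC intensity profiles; the abstract states neither definition. The function names and the toy numbers are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs of points.

    `points` holds QC coordinates in a score plot (e.g. PC1, PC2),
    assumed to have been computed elsewhere.
    """
    pairs = list(combinations(points, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(profiles):
    """Mean Pearson correlation over all unordered pairs of QC profiles."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Toy (made-up) QC score coordinates before and after a correction step.
qc_scores_before = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
qc_scores_after = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8)]

# Toy (made-up) QC intensity profiles across four peaks.
qc_profiles = [
    [10.0, 20.0, 30.0, 40.0],
    [11.0, 19.0, 31.0, 42.0],
    [9.5, 21.0, 29.0, 39.0],
]

spread_before = mean_pairwise_distance(qc_scores_before)
spread_after = mean_pairwise_distance(qc_scores_after)
avg_corr = mean_pairwise_correlation(qc_profiles)
```

Because QCs are repeated injections of the same pooled sample, a smaller spread and a correlation closer to 1 after correction would indicate less residual batch variation under these assumed definitions.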

Authors

  • Zhiwei Rong
    Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China.
  • Qilong Tan
    Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China.
  • Lei Cao
    State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning, People's Republic of China. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, Liaoning, People's Republic of China. University of Chinese Academy of Sciences, Beijing, People's Republic of China.
  • Liuchao Zhang
    Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China.
  • Kui Deng
    National Center for Birth Defect Monitoring of China, Department of Pediatrics, West China Second University Hospital, Sichuan University, Chengdu, China.
  • Yue Huang
    Xiamen University, Xiamen, Fujian 361005, China.
  • Zheng-Jiang Zhu
    Interdisciplinary Research Center on Biology and Chemistry, and Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, PR China. Electronic address: jiangzhu@sioc.ac.cn.
  • Zhenzi Li
    Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China.
  • Kang Li
    Department of Otolaryngology, Longgang Otolaryngology Hospital & Shenzhen Key Laboratory of Otolaryngology, Shenzhen Institute of Otolaryngology, Shenzhen, Guangdong, China.