Deep Learning Framework for Integrating Multibatch Calibration, Classification, and Pathway Activities.

Journal: Analytical Chemistry
Published Date:

Abstract

The amount of available biological data has exploded since the emergence of high-throughput technologies, which is not only revolutionizing the way we recognize molecules and diseases but also bringing novel analytical challenges to bioinformatics. In recent years, deep learning has become a dominant technique in data science. However, its classification accuracy is plagued by domain discrepancy. Notably, when multiple batches are present, domain discrepancy typically arises between individual batches. Most pairwise adaptation approaches may be suboptimal because they fail to simultaneously eliminate external factors across multiple batches and take the classification task into account. We propose a joint deep learning framework that integrates batch effect removal, classification, and downstream pathway activity analysis for biological data. We validate the framework on two MALDI MS-based metabolomics datasets, where it achieves the highest diagnostic accuracy (ACC), a notable ∼10% improvement over the other methods. Overall, these results indicate that our approach removes batch effects more effectively than state-of-the-art methods and yields more accurate classification as well as biomarkers for smart diagnosis.

Authors

  • JingYang Niu
    School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China.
  • Wei Xu
    College of Food and Bioengineering, Henan University of Science and Technology, Luoyang 471023, China.
  • DongMing Wei
    School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China.
  • Kun Qian
    Key Laboratory of Brain Health Intelligent Evaluation and Intervention (Beijing Institute of Technology), Ministry of Education, Beijing, China.
  • Qian Wang
    Department of Radiation Oncology, China-Japan Union Hospital of Jilin University, Changchun, China.