A robust and interpretable end-to-end deep learning model for cytometry data.

Journal: Proceedings of the National Academy of Sciences of the United States of America
PMID:

Abstract

Cytometry technologies are essential tools for immunology research, providing high-throughput measurements of immune cells at the single-cell level. Existing approaches to interpreting and using cytometry measurements include manual or automated gating to identify cell subsets from the cytometry data. Gating provides highly intuitive results but may lead to significant information loss, in that additional details in measured or correlated cell signals might be missed. In this study, we propose and test a deep convolutional neural network for analyzing cytometry data in an end-to-end fashion, allowing a direct association between raw cytometry data and the clinical outcome of interest. Using nine large cytometry by time-of-flight mass spectrometry, or mass cytometry (CyTOF), studies from the open-access ImmPort database, we demonstrated that the deep convolutional neural network model can accurately diagnose latent cytomegalovirus (CMV) infection in healthy individuals, even when using highly heterogeneous data from different studies. In addition, we developed a permutation-based method for interpreting the deep convolutional neural network model. We identified a CD27- CD94+ CD8+ T cell population significantly associated with latent CMV infection, confirming the findings of previous studies. Finally, we provide a tutorial for creating, training, and interpreting the tailored deep learning model for cytometry data using Keras and TensorFlow (https://github.com/hzc363/DeepLearningCyTOF).
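
The abstract does not spell out the permutation-based interpretation procedure, so the following is only a minimal sketch of the general permutation idea, not necessarily the authors' method. It shuffles one marker at a time across the cells of a sample and records how much a model's output changes. Everything here is an assumption made for illustration: `toy_score` is a hypothetical stand-in for a trained network, `permutation_importance` is a hypothetical helper, and the three-marker sample is synthetic.

```python
import random


def toy_score(cells):
    """Hypothetical stand-in for a trained model (assumption, not the paper's CNN).

    Returns the fraction of cells that are low on marker 0 and high on marker 1.
    """
    hits = sum(1 for c in cells if c[0] < 0.5 and c[1] >= 0.5)
    return hits / len(cells)


def permutation_importance(score_fn, cells, n_repeats=5, seed=0):
    """Mean absolute change in score_fn when one marker is shuffled across cells."""
    rng = random.Random(seed)
    baseline = score_fn(cells)
    n_markers = len(cells[0])
    importances = []
    for j in range(n_markers):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle marker j across cells, leaving all other markers in place.
            column = [c[j] for c in cells]
            rng.shuffle(column)
            permuted = [c[:j] + [v] + c[j + 1:] for c, v in zip(cells, column)]
            total += abs(score_fn(permuted) - baseline)
        importances.append(total / n_repeats)
    return importances


# Synthetic sample (illustrative only): markers 0 and 1 jointly define a
# cell subset, while marker 2 is ignored by the toy model.
cells = [[0.0, 1.0, 0.3] for _ in range(100)] + [[1.0, 0.0, 0.7] for _ in range(100)]
importances = permutation_importance(toy_score, cells)
```

In this toy setting, shuffling marker 2 leaves the score unchanged, while shuffling marker 0 or marker 1 breaks their joint pattern and shifts the score. With a real model, `score_fn` would wrap the trained network's prediction for a sample.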

Authors

  • Zicheng Hu
    Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158. Correspondence: zicheng.hu@ucsf.edu, atul.butte@ucsf.edu.
  • Alice Tang
    Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158.
  • Jaiveer Singh
    Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158.
  • Sanchita Bhattacharya
    Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158.
  • Atul J Butte
    Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA.