Comprehensive single-cell RNA-seq analysis using deep interpretable generative modeling guided by biological hierarchy knowledge.

Journal: Briefings in Bioinformatics
PMID:

Abstract

Recent advances in microfluidics and sequencing technologies allow researchers to explore cellular heterogeneity at single-cell resolution. In recent years, deep learning frameworks, such as generative models, have substantially changed the analysis of transcriptomic data. Nevertheless, the latent space of these generative models alone is insufficient to yield biological explanations. In addition, most previous work based on generative models is limited to shallow neural networks with one to three layers of latent variables, which may limit model capacity. Here, we propose a deep interpretable generative model, d-scIGM, for single-cell data analysis. d-scIGM combines sawtooth connectivity techniques with residual networks to construct a deep generative framework. In addition, d-scIGM incorporates hierarchical prior knowledge of biological domains to enhance model interpretability. We show that d-scIGM achieves excellent performance on a variety of fundamental tasks, including clustering, visualization, and pseudo-temporal inference. Through topic pathway studies, we found that topics learned by d-scIGM are better enriched for biologically meaningful pathways than those of the baseline models. Furthermore, analysis of drug response data shows that d-scIGM can capture drug response patterns in large-scale experiments, which provides a promising way to elucidate the underlying biological mechanisms. Lastly, in a melanoma dataset, d-scIGM accurately identified different cell types and revealed multiple melanin-related driver genes and key pathways, which are critical for understanding disease mechanisms and for drug development.

Authors

  • Hegang Chen
    School of Computer Science and Engineering, Sun Yat-sen University, 132 Waihuan East Road, Guangzhou University Town, 510006, Guangzhou, China.
  • Yuyin Lu
    School of Computer Science and Engineering, Sun Yat-sen University, 132 Waihuan East Road, Guangzhou University Town, 510006, Guangzhou, China.
  • Zhiming Dai
    School of Data and Computer Science.
  • Yuedong Yang
    Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr., Southport, QLD 4222, Australia.
  • Qing Li
    Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA.
  • Yanghui Rao
    School of Computer Science and Engineering, Sun Yat-sen University, 132 Waihuan East Road, Guangzhou University Town, 510006, Guangzhou, China.