DeepGSEA: explainable deep gene set enrichment analysis for single-cell transcriptomic data.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Gene set enrichment (GSE) analysis interprets gene expression through pre-defined gene set databases and is a critical step in understanding different phenotypes. With the rapid development of single-cell RNA sequencing (scRNA-seq) technology, GSE analysis can be performed on fine-grained gene expression data to gain a nuanced understanding of phenotypes of interest. However, because of the cellular heterogeneity in single-cell gene profiles, current statistical GSE analysis methods sometimes fail to identify enriched gene sets. Meanwhile, deep learning has gained traction in single-cell applications such as clustering and trajectory inference because of its ability to capture complex data patterns. Its use in GSE analysis nevertheless remains limited because of interpretability challenges.

Authors

  • Guangzhi Xiong
    Department of Computer Science, University of Virginia, Charlottesville, VA, 22904, United States.
  • Nathan J LeRoy
    Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22904, United States.
  • Stefan Bekiranov
    Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, 22903, United States.
  • Nathan C Sheffield
    Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22904, United States.
  • Aidong Zhang
    Department of Computer Science and Engineering, SUNY at Buffalo, Buffalo, USA.