Clustering of single-cell multi-omics data with a multimodal deep learning method.

Journal: Nature Communications
Published Date:

Abstract

Single-cell multimodal sequencing technologies have been developed to simultaneously profile different modalities of data in the same cell. They provide a unique opportunity to jointly analyze multimodal data at the single-cell level for the identification of distinct cell types. A correct clustering result is essential for downstream complex biological functional studies. However, combining different data sources for clustering analysis of single-cell multimodal data remains a statistical and computational challenge. Here, we develop a novel multimodal deep learning method, scMDC, for single-cell multi-omics data clustering analysis. scMDC is an end-to-end deep model that explicitly characterizes different data sources and jointly learns latent features of deep embedding for clustering analysis. Extensive simulation and real-data experiments reveal that scMDC outperforms existing single-cell single-modal and multimodal clustering methods on different single-cell multimodal datasets. The linear scalability of running time makes scMDC a promising method for analyzing large multimodal datasets.

Authors

  • Xiang Lin
    Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA.
  • Tian Tian
    Laboratory Animal Center, College of Animal Science, Jilin University, Changchun, China.
  • Zhi Wei
    Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA. zhiwei@njit.edu
  • Hakon Hakonarson
    The Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.