scMRA: a robust deep learning method to annotate scRNA-seq data with multiple reference datasets.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Single-cell RNA-seq (scRNA-seq) has been widely used to resolve cellular heterogeneity. After collecting scRNA-seq data, the natural next step is to integrate the accumulated data to achieve a common ontology of cell types and states. An effective and efficient cell-type identification method is therefore urgently needed. Precise annotation also requires high-quality reference data, but such tailored reference data are often lacking in practice. To address this, we aggregated multiple datasets into a meta-dataset and conducted annotation on it. Existing supervised and semi-supervised annotation methods, however, suffer from batch effects caused by different sequencing platforms, and these effects become more severe when multiple reference datasets are used.

Authors

  • Musu Yuan
    School of Mathematical Sciences, Peking University, Beijing 100871, China.
  • Liang Chen
    Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China.
  • Minghua Deng
    Center for Quantitative Biology, Peking University, Beijing, China. dengmh@pku.edu.cn.