scMUSCL: multi-source transfer learning for clustering scRNA-seq data.

Journal: Bioinformatics (Oxford, England)
PMID:

Abstract

MOTIVATION: Single-cell RNA sequencing (scRNA-seq) analysis relies heavily on effective clustering, which underpins numerous downstream applications. Although several machine learning methods have been developed to improve single-cell clustering, most are fully unsupervised and overlook the large body of annotated datasets available from previous single-cell experiments. Because single-cell expression profiles are inherently high-dimensional, unsupervised clustering often produces clusters that lack biological relevance. Using annotated scRNA-seq datasets as a reference can improve clustering performance and enable the identification of biologically meaningful clusters in target datasets.

Authors

  • Arash Khoeini
    School of Computing Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada.
  • Funda Sar
    Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada.
  • Yen-Yi Lin
    Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada.
  • Colin Collins
    Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada.
  • Martin Ester
    School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada.