A robust and scalable graph neural network for accurate single-cell classification.

Journal: Briefings in bioinformatics
Published Date:

Abstract

Single-cell RNA sequencing (scRNA-seq) techniques provide high-resolution data on cellular heterogeneity in diverse tissues, and a critical step in the data analysis is cell type identification. Traditional methods usually cluster the cells and manually identify cell clusters through marker genes, which is time-consuming and subjective. With the launch of several large-scale single-cell projects, millions of sequenced cells have been annotated, and transferring labels from these annotated datasets to newly generated datasets is a promising alternative. One powerful approach to this transfer is to learn cell relations through a graph neural network (GNN), but traditional GNNs struggle to process millions of cells because the message-passing procedure is expensive and must be repeated at each training epoch. Here, we have developed a robust and scalable GNN-based method for accurate single-cell classification (GraphCS), in which a graph is constructed to connect similar cells within and between labelled and unlabelled scRNA-seq datasets for propagation of shared information. To overcome the slow information propagation of GNNs at each training epoch, the diffused information is pre-calculated via the approximate Generalized PageRank algorithm, enabling complexity that is sublinear in the number of cells. Compared with existing methods, GraphCS demonstrates better performance on simulated, cross-platform, cross-species and cross-omics scRNA-seq datasets. More importantly, our model offers high speed and scalability on large datasets, and can achieve superior performance for 1 million cells within 50 min.
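
The idea of pre-calculating the diffused information can be illustrated with a minimal sketch. This is not the GraphCS implementation: it assumes a cosine-similarity k-nearest-neighbour graph, computes the Generalized PageRank sum exactly with dense matrices (the abstract describes an approximate algorithm, which is what gives sublinear complexity), and uses personalized-PageRank-style weights chosen only for illustration. The function names `knn_graph` and `gpr_features` and all parameter values are hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-nearest-neighbour adjacency matrix from cosine similarity."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)           # a cell is never its own neighbour
    idx = np.argsort(-S, axis=1)[:, :k]    # k most similar cells per row
    A = np.zeros_like(S)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, idx.ravel()] = 1.0
    return np.maximum(A, A.T)              # symmetrise

def gpr_features(A, X, weights):
    """Exact Generalized PageRank features: sum_l weights[l] * P^l @ X,
    with P the symmetrically normalised adjacency including self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    P = A_hat / np.sqrt(np.outer(d, d))
    H = X.copy()
    Z = weights[0] * H
    for w in weights[1:]:
        H = P @ H                          # one more propagation hop
        Z += w * H
    return Z

# Illustrative usage on random data (rows stand in for cells, columns for genes).
rng = np.random.default_rng(0)
X = rng.random((200, 30))
alpha, hops = 0.15, 10                     # assumed values, not from the paper
weights = [alpha * (1 - alpha) ** l for l in range(hops + 1)]
Z = gpr_features(knn_graph(X, k=10), X, weights)
```

Because `Z` is computed once before training, a plain classifier can then be fitted on the rows of `Z` that belong to labelled cells, with no message passing inside the training loop.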

Authors

  • Yuansong Zeng
    School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China.
  • Zhuoyi Wei
    School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China.
  • Zixiang Pan
    School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510000, China.
  • Yutong Lu
    School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China.
  • Yuedong Yang
    Institute for Glycomics and School of Information and Communication Technique, Griffith University, Parklands Dr. Southport, QLD 4222, Australia.