Deep learning in single-cell and spatial transcriptomics data analysis: advances and challenges from a data science perspective.

Journal: Briefings in bioinformatics
PMID:

Abstract

The development of single-cell and spatial transcriptomics has revolutionized our capacity to investigate cellular properties, functions, and interactions in both cellular and spatial contexts. Despite this progress, the analysis of single-cell and spatial omics data remains challenging. First, single-cell sequencing data are high-dimensional and sparse, and are often contaminated by noise and uncertainty, obscuring the underlying biological signal. Second, these data often encompass multiple modalities, including gene expression, epigenetic modifications, metabolite levels, and spatial locations. Integrating these diverse data modalities is crucial for enhancing prediction accuracy and biological interpretability. Third, while the scale of single-cell sequencing has expanded to millions of cells, high-quality annotated datasets are still limited. Fourth, complex correlations within biological tissues make it difficult to accurately reconstruct cellular states and spatial contexts. Traditional feature engineering approaches struggle with the complexity of biological networks, whereas deep learning, with its ability to handle high-dimensional data and automatically identify meaningful patterns, has shown great promise in overcoming these challenges. In addition to systematically reviewing the strengths and weaknesses of advanced deep learning methods, we curated 21 datasets from nine benchmarks to evaluate the performance of 58 computational methods. Our analysis reveals that model performance can vary significantly across benchmark datasets and evaluation metrics, which provides a useful perspective for selecting the most appropriate approach for a specific application scenario. We highlight three key areas for future development, offering insights into how deep learning can be effectively applied to transcriptomic data analysis in biological, medical, and clinical settings.

Authors

  • Shuang Ge
    Shenzhen International Graduate School, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, China.
  • Shuqing Sun
    Shenzhen International Graduate School, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, China.
  • Huan Xu
    School of Food Science and Engineering, Hainan University, 58 Renmin Avenue, Haikou 570228, China. Electronic address: xuhuan.hnu@foxmail.com.
  • Qiang Cheng
    Department of Urology, Chinese People's Liberation Army General Hospital, Beijing 100039, China.
  • Zhixiang Ren
    Peng Cheng Laboratory, Shenzhen, 518055, Guangdong Province, China. Electronic address: renzhx@pcl.ac.cn.