Generalizable and Scalable Visualization of Single-Cell Data Using Neural Networks.

Journal: Cell Systems
Published Date:

Abstract

Visualization algorithms are fundamental tools for interpreting single-cell data. However, standard methods, such as t-distributed stochastic neighbor embedding (t-SNE), do not scale to datasets with millions of cells, and the resulting visualizations cannot be generalized to analyze new datasets. Here we introduce net-SNE, a generalizable visualization approach that trains a neural network to learn a mapping function from high-dimensional single-cell gene-expression profiles to a low-dimensional visualization. We benchmark net-SNE on 13 different datasets and show that it achieves visualization quality and clustering accuracy comparable with t-SNE. Additionally, we show that the mapping function learned by net-SNE can accurately position entire new subtypes of cells from previously unseen datasets and can also be used to reduce the runtime of visualizing 1.3 million cells by 36-fold (from 1.5 days to an hour). Our work provides a framework for bootstrapping single-cell analysis from existing datasets.
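
The idea of learning a reusable mapping function, rather than a fixed set of coordinates, can be illustrated with a minimal sketch. This is not the authors' implementation: the one-hidden-layer network, the fixed-bandwidth Gaussian affinities (standard t-SNE tunes the bandwidth per point from a perplexity target), the backtracking step size, the synthetic two-cluster data, and all function names (affinities, forward, kl_loss_and_grad, train) are assumptions made for illustration. It only shows how a network trained on a t-SNE-style KL objective can afterwards place previously unseen points.

```python
import numpy as np

def affinities(X, sigma=2.0):
    """Symmetric Gaussian affinities P with a fixed bandwidth (an assumption;
    standard t-SNE calibrates the bandwidth per point via perplexity)."""
    d2 = np.square(X[:, None, :] - X[None, :, :]).sum(-1)
    P = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    return P / P.sum()

def forward(params, X):
    """One-hidden-layer network mapping expression profiles to 2-D coordinates."""
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)
    return H, H @ W2 + b2

def kl_loss_and_grad(params, X, P, eps=1e-12):
    """t-SNE KL divergence on the network output, with gradients for the weights."""
    W1, b1, W2, b2 = params
    H, Y = forward(params, X)
    diff = Y[:, None, :] - Y[None, :, :]            # diff[i, j] = y_i - y_j
    inv = 1.0 / (1.0 + np.square(diff).sum(-1))     # Student-t kernel
    np.fill_diagonal(inv, 0.0)
    Q = inv / inv.sum()
    loss = float(np.sum(P * np.log((P + eps) / (Q + eps))))
    # Standard t-SNE gradient with respect to each embedded point y_i.
    G = 4.0 * (((P - Q) * inv)[:, :, None] * diff).sum(1)
    # Backpropagate through the output layer and the tanh hidden layer.
    dZ = (G @ W2.T) * (1.0 - H ** 2)
    grads = [X.T @ dZ, dZ.sum(0), H.T @ G, G.sum(0)]
    return loss, grads

def train(X, hidden=16, steps=200, lr=10.0, seed=0):
    """Gradient descent that halves the step size whenever a step raises the loss."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0, 0.1, (X.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0, 0.1, (hidden, 2)), np.zeros(2)]
    P = affinities(X)
    loss, grads = kl_loss_and_grad(params, X, P)
    history = [loss]
    for _ in range(steps):
        trial = [p - lr * g for p, g in zip(params, grads)]
        trial_loss, trial_grads = kl_loss_and_grad(trial, X, P)
        if trial_loss <= loss:                      # accept only non-increasing steps
            params, loss, grads = trial, trial_loss, trial_grads
        else:
            lr *= 0.5
        history.append(loss)
    return params, history

# Synthetic stand-in for expression profiles: two clusters of 30 "cells", 10 features.
rng = np.random.default_rng(1)
X_train = np.vstack([rng.normal(0, 0.5, (30, 10)), rng.normal(3, 0.5, (30, 10))])
params, history = train(X_train)

# Unseen cells are placed with a single forward pass, without re-running the optimization.
X_new = rng.normal(3, 0.5, (5, 10))
_, Y_new = forward(params, X_new)
```

Because the embedding is a function of the input rather than a free parameter per cell, new data only costs a forward pass. A mapping trained on a subset of cells could in the same way be applied to the rest of a large dataset, which is the kind of reuse the abstract describes.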

Authors

  • Hyunghoon Cho
    Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA and Department of Mathematics, MIT, Cambridge, MA, USA.
  • Bonnie Berger
    Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA and Department of Mathematics, MIT, Cambridge, MA, USA.
  • Jian Peng
    Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA.