Predicting single-cell gene expression profiles of imaging flow cytometry data with machine learning.
Journal:
Nucleic acids research
PMID:
33119742
Abstract
High-content imaging and single-cell genomics are two of the most prominent high-throughput technologies for studying cellular properties and functions at scale. Recent studies have demonstrated that information in large imaging datasets can be used to estimate gene mutations and to predict cell-cycle state and cellular decision making directly from cellular morphology. Thus, high-throughput imaging methodologies, such as imaging flow cytometry, can potentially aim beyond simple sorting of cell populations. We introduce IFC-seq, a machine learning methodology for predicting the expression profile of every cell in an imaging flow cytometry experiment. Since it is to date infeasible to observe both single-cell gene expression and morphology in flow, we integrate uncoupled imaging data with an independent transcriptomics dataset by leveraging common surface markers. We demonstrate that IFC-seq successfully models the expression of a moderate number of key marker genes for two independent imaging flow cytometry datasets: (i) human blood mononuclear cells and (ii) mouse myeloid progenitor cells. In the case of mouse myeloid progenitor cells, IFC-seq can predict gene expression directly from brightfield images in a label-free manner, using a convolutional neural network. The proposed method promises to add gene expression information to existing and new imaging flow cytometry datasets at no additional cost.
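The integration idea described in the abstract, linking two uncoupled datasets through shared surface markers, can be illustrated with a minimal sketch. This is not the IFC-seq implementation: the abstract does not specify the model, so ridge regression is used here purely as a stand-in, all data are synthetic, and the function names `fit_marker_to_gene_model` and `predict_expression` are hypothetical. A model is fitted on the transcriptomics side (marker levels to gene expression) and then applied to marker intensities from the imaging side.

```python
import numpy as np


def fit_marker_to_gene_model(markers, genes, ridge=1e-3):
    """Fit a ridge regression from surface-marker levels to gene expression.

    markers: array of shape (n_cells, n_markers)
    genes:   array of shape (n_cells, n_genes)
    Returns a weight matrix of shape (n_markers + 1, n_genes); the last row
    is the intercept (also lightly penalised, for simplicity).
    """
    X = np.hstack([markers, np.ones((markers.shape[0], 1))])  # intercept column
    # Closed-form ridge solution: (X^T X + ridge * I)^-1 X^T Y
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ genes)


def predict_expression(weights, markers):
    """Predict gene expression for cells described only by marker levels."""
    X = np.hstack([markers, np.ones((markers.shape[0], 1))])
    return X @ weights


# Synthetic stand-in for the transcriptomics dataset (assumption: 3 shared
# surface markers and 5 genes of interest, linearly related plus noise).
rng = np.random.default_rng(0)
true_w = rng.normal(size=(3, 5))
seq_markers = rng.normal(size=(200, 3))
seq_genes = seq_markers @ true_w + 0.01 * rng.normal(size=(200, 5))

# Step 1: learn marker -> expression on the transcriptomics side.
weights = fit_marker_to_gene_model(seq_markers, seq_genes)

# Step 2: apply the model to marker intensities measured by imaging flow
# cytometry (again synthetic), yielding one predicted profile per imaged cell.
ifc_markers = rng.normal(size=(50, 3))
predicted = predict_expression(weights, ifc_markers)
print(predicted.shape)  # (50, 5)
```

In the label-free setting mentioned in the abstract, the marker intensities in step 2 would themselves have to be inferred from brightfield images; that step is omitted here.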
Authors
Keywords
Animals
Computational Biology
Databases, Genetic
Female
Fetal Blood
Flow Cytometry
Gene Expression Profiling
Gene Expression Regulation
High-Throughput Nucleotide Sequencing
Humans
Image Processing, Computer-Assisted
Machine Learning
Male
Mice
Mice, Inbred C57BL
Monocytes
Myeloid Progenitor Cells
Neural Networks, Computer
Single-Cell Analysis
Transcriptome