Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images
Journal: arXiv
Published Date: Jan 26, 2025
Abstract
Spatial Transcriptomics (ST) enables high-resolution measurement of RNA
sequence abundance by systematically connecting cell morphology depicted in
Hematoxylin and Eosin (H&E) stained histology images to spatially resolved gene
expression. Although time-consuming and expensive, ST is a powerful experimental
technique that provides new opportunities to understand cancer mechanisms at a
fine-grained molecular level, which is critical for uncovering new approaches
to disease diagnosis and treatment. Here, we present $\textbf{Stem}$
($\textbf{S}$pa$\textbf{T}$ially resolved gene $\textbf{E}$xpression inference
with diffusion $\textbf{M}$odel), a novel computational tool that leverages a
conditional diffusion generative model to enable in silico gene expression
inference from H&E stained images. By better capturing the inherent
stochasticity and heterogeneity in ST data, $\textbf{Stem}$ achieves
state-of-the-art performance on spatial gene expression prediction and
generates biologically meaningful gene profiles for new H&E stained images at
test time. We evaluate the proposed algorithm on datasets with various tissue
sources and sequencing platforms, where it demonstrates clear improvement over
existing approaches. $\textbf{Stem}$ generates high-fidelity gene expression
predictions whose gene variation levels are similar to those of the ground truth
data, suggesting that our method preserves the underlying biological heterogeneity.
Our proposed pipeline opens up the possibility of analyzing existing, easily
accessible H&E stained histology images from a genomics point of view without
physically performing gene expression profiling, and it enables potential
biological discovery from H&E stained histology images.
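
The claim that predictions share similar gene variation levels with ground truth data can be made concrete with a small check. The sketch below is only an illustration, not the evaluation used in the paper: the function name `gene_variance_ratio`, the spots-by-genes list layout, and the choice of a per-gene variance ratio as the summary are all assumptions introduced here. A ratio near 1 for a gene would suggest that the predicted spread across spots is close to the measured spread, while a ratio well below 1 would suggest over-smoothed predictions.

```python
from statistics import pvariance


def gene_variance_ratio(pred, truth):
    """Per-gene ratio of predicted to measured expression variance.

    Hypothetical helper (not from the paper). `pred` and `truth` are
    lists of spots, each spot being a list of expression values with
    one entry per gene, in the same gene order.
    """
    if len(pred) != len(truth) or len(pred[0]) != len(truth[0]):
        raise ValueError("pred and truth must have the same shape")

    ratios = []
    for g in range(len(truth[0])):
        # Collect the values of gene g across all spots.
        pred_values = [spot[g] for spot in pred]
        truth_values = [spot[g] for spot in truth]
        truth_var = pvariance(truth_values)
        if truth_var > 0:
            ratios.append(pvariance(pred_values) / truth_var)
        else:
            # The ratio is undefined for a gene that is constant in the data.
            ratios.append(float("nan"))
    return ratios


# Toy example: three spots, two genes (made-up numbers).
measured = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
shrunk = [[0.5 * v for v in spot] for spot in measured]
ratios = gene_variance_ratio(shrunk, measured)
```

In this toy example the second matrix is the first scaled by 0.5, so each gene's variance shrinks by a factor of 0.25 even though the two matrices are perfectly correlated. This is why a variance comparison can complement correlation-based metrics when assessing whether heterogeneity is preserved.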