From Histopathology Images to Cell Clouds: Learning Slide Representations with Hierarchical Cell Transformer
Journal: arXiv
Published Date: Dec 21, 2024
Abstract
Directly analyzing and modeling the spatial distribution of cells in
histopathology whole slide images (WSIs) is clinically crucial and potentially
very beneficial. However, most existing WSI datasets lack cell-level
annotations, owing to the extremely high cost of annotating gigapixel images.
Thus, it remains an open question whether deep learning models can directly and
effectively analyze WSIs from the semantic perspective of cell distributions. In
this work, we construct a large-scale WSI dataset with more than 5 billion
cell-level annotations, termed WSI-Cell5B, and propose a novel hierarchical
Cell Cloud Transformer (CCFormer) to tackle these challenges. WSI-Cell5B is
based on 6,998
WSIs of 11 cancer types from The Cancer Genome Atlas (TCGA) Program, and every
cell in each WSI is annotated with its coordinates and type. To the best of our
knowledge, WSI-Cell5B is the first large-scale WSI-level dataset with
cell-level annotations. CCFormer, in turn, formulates the collection of cells
in each WSI as a cell cloud and models the spatial distribution of cells.
Specifically,
Neighboring Information Embedding (NIE) is proposed to characterize the
distribution of cells within the neighborhood of each cell, and a novel
Hierarchical Spatial Perception (HSP) module is proposed to learn the spatial
relationship among cells in a bottom-up manner. Our clinical analysis indicates
that WSI-Cell5B can be used to design cell-count-based clinical evaluation
metrics that effectively assess patient survival risk. Extensive experiments on
survival prediction and cancer staging show that learning from cell spatial
distribution alone already achieves state-of-the-art (SOTA) performance, with
CCFormer strongly outperforming competing methods.
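
To make the idea of characterizing the distribution of cells within the neighborhood of each cell more concrete, the following is a minimal, hand-crafted sketch and not the NIE module itself, which the abstract does not specify. It assumes a cell cloud is given as a list of (x, y) coordinates plus an integer type label per cell, and it describes each cell by the fraction of each type among its k nearest neighbors. The function name `neighborhood_type_histogram`, its parameters, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def neighborhood_type_histogram(coords, types, num_types, k):
    """For each cell, return the fraction of each cell type among its k nearest neighbors.

    Illustrative only: a fixed histogram feature, not the learned embedding from the paper.
    """
    if not 0 < k < len(coords):
        raise ValueError("k must be between 1 and len(coords) - 1")
    features = []
    for i, (xi, yi) in enumerate(coords):
        # Brute-force distances to every other cell (acceptable for a toy example only;
        # a real WSI with millions of cells would need a spatial index).
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        ]
        dists.sort()
        # Count the types of the k closest cells and normalize to fractions.
        counts = Counter(types[j] for _, j in dists[:k])
        features.append([counts.get(t, 0) / k for t in range(num_types)])
    return features


# Toy "cell cloud": two small clusters, two hypothetical cell types (0 and 1).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
types = [0, 0, 1, 1, 1, 0]
features = neighborhood_type_histogram(coords, types, num_types=2, k=2)
```

Each row of `features` is a per-cell descriptor that sums to 1. In a learned model, a descriptor of this kind would presumably be replaced or augmented by trainable embeddings, but that detail is an assumption here rather than something stated in the abstract.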