Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification
Journal: arXiv
Published Date: Feb 4, 2025
Abstract
Recent advancements in foundation models have transformed computer vision,
driving significant performance improvements across diverse domains, including
digital histopathology. However, the advantages of domain-specific
histopathology foundation models over general-purpose models for specialized
tasks such as cell analysis remain underexplored. This study investigates the
representation learning gap between these two categories by analyzing
multi-level patch embeddings applied to cell instance segmentation and
classification. We implement an encoder-decoder architecture that pairs a
single, fixed decoder with interchangeable encoders. The general-purpose
foundation models are represented by convolutional, vision transformer (ViT),
and hybrid encoders pre-trained on ImageNet-22K or LVD-142M. We compare them against ViT
encoders from the recently released UNI, Virchow2, and Prov-GigaPath foundation
models, trained on patches extracted from hundreds of thousands of
histopathology whole-slide images. The decoder integrates patch embeddings from
different encoder depths via skip connections to generate semantic and distance
maps. These maps are then post-processed both to create instance segmentation masks,
in which each label corresponds to an individual cell, and to perform cell-type
classification. All encoders remain frozen during training to assess their
pre-trained feature extraction capabilities. Using the PanNuke and CoNIC
histopathology datasets, and the newly introduced Nissl-stained CytoDArk0
dataset for brain cytoarchitecture studies, we evaluate instance-level
detection, segmentation accuracy, and cell-type classification. This study
provides insights into the comparative strengths and limitations of
general-purpose vs. histopathology foundation models, offering guidance for
model selection in cell-focused histopathology and brain cytoarchitecture
analysis workflows.
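
The abstract does not specify how the semantic and distance maps are post-processed into instance masks. As an illustration only, the sketch below shows one common seed-and-grow scheme and should not be read as the authors' method. It assumes a binary foreground map and a distance map normalized to [0, 1]. The function name `label_instances` and the parameter `seed_threshold` are hypothetical. High-distance pixels, which lie near cell centers, form one seed per cell, and the seeds are then grown breadth-first into the remaining foreground so that touching cells receive different labels. Cell-type classification is left out.

```python
from collections import deque

def label_instances(semantic, distance, seed_threshold=0.5):
    """Turn a foreground map and a distance map into instance labels.

    semantic: 2D list of 0/1 values (1 = cell foreground).
    distance: 2D list of floats, assumed normalized to [0, 1], high near cell centers.
    Returns (labels, count); label 0 means background or unreached foreground.
    """
    h, w = len(semantic), len(semantic[0])
    labels = [[0] * w for _ in range(h)]
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity
    frontier = deque()
    count = 0

    def is_seed(y, x):
        return semantic[y][x] and distance[y][x] >= seed_threshold

    # Step 1: each connected component of seed pixels becomes one instance.
    for y in range(h):
        for x in range(w):
            if labels[y][x] or not is_seed(y, x):
                continue
            count += 1
            labels[y][x] = count
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                frontier.append((cy, cx))
                for dy, dx in neighbours:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and not labels[ny][nx] and is_seed(ny, nx):
                        labels[ny][nx] = count
                        stack.append((ny, nx))

    # Step 2: grow all seeds breadth-first into the unlabeled foreground.
    # Foreground that no seed can reach keeps label 0.
    while frontier:
        cy, cx = frontier.popleft()
        for dy, dx in neighbours:
            ny, nx = cy + dy, cx + dx
            if 0 <= ny < h and 0 <= nx < w and semantic[ny][nx] and not labels[ny][nx]:
                labels[ny][nx] = labels[cy][cx]
                frontier.append((ny, nx))
    return labels, count

# Toy example: two touching cells in one row, each with a single central peak.
demo_semantic = [[0] * 7, [1] * 7, [0] * 7]
demo_distance = [[0.0] * 7, [0.2, 0.9, 0.2, 0.1, 0.2, 0.9, 0.2], [0.0] * 7]
demo_labels, demo_count = label_instances(demo_semantic, demo_distance)
print(demo_count)      # 2
print(demo_labels[1])  # [1, 1, 1, 1, 2, 2, 2]
```

A marker-controlled watershed on the distance map is a common alternative to the breadth-first growth used here. The simple version is enough to show why predicting a distance map alongside the semantic map helps separate adjacent cells that a foreground mask alone would merge.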