HVUNet: A hybrid vision transformer-based UNet for accurate detection and localization in histopathology images.

Journal: Computers in biology and medicine

Published Date: Jul 15, 2025

Abstract

Precise identification of objects of interest (OoIs) in histopathology images plays a vital role in cancer diagnosis and prognosis. Despite advances in digital pathology, detecting specific cellular structures within these images remains a significant challenge because of the inherent complexity and variability of cell morphology. Different cellular structures often share visual characteristics such as color, shape, and texture, making them difficult to distinguish from one another. Certain OoIs are also much smaller than the surrounding cells, which makes manual detection both challenging and error-prone. This paper introduces a hybrid vision transformer-based UNet (HVUNet), a novel model designed to identify and localize OoIs in histopathology images. To improve detection, the proposed model combines UNet with vision transformers (ViTs) within an advanced encoder-decoder architecture. We evaluate HVUNet on the GZMH dataset, which contains histopathology images annotated for mitosis detection, and on the Lymphocyte Detection (LD) dataset for lymphocyte cell detection. Through comprehensive experiments, we demonstrate that HVUNet notably surpasses several state-of-the-art models, including CNN variants, ViT-based models, and hybrid CNN-ViT architectures. On the GZMH dataset, HVUNet outperforms traditional models such as UNet and more recent models such as UNETR and AttentionUNet, with a precision of 0.94, a recall of 0.60, and an F1-score of 0.72. HVUNet also attains an Intersection over Union (IoU) score of 0.76 and a mean Average Precision (mAP) of 0.81, underscoring its effectiveness in detecting mitotic cells. On the lymphocyte detection dataset, the model achieves an F1-score of 0.76, an IoU of 0.63, and a mAP of 0.75, demonstrating its effectiveness in detecting lymphocyte cells.
To evaluate generalizability, we tested HVUNet on the MIDOG 2021 and PanopTILs datasets, observing competitive performance that demonstrated its robustness and broad applicability across diverse histopathology image analysis tasks.
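
For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how precision, recall, F1-score, and IoU are commonly computed. It is not the authors' evaluation code: the abstract does not state how detections are matched to annotations, so the function names, the count-based interface, and the binary-mask form of IoU are all assumptions made for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from detection counts.

    tp, fp, fn are counts of true positives, false positives, and false
    negatives. How a detection is judged "true" (the matching rule) is an
    assumption left to the caller; the abstract does not specify it.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mask_iou(pred, target):
    """Intersection over Union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (rows of pixels). Using pixel
    masks rather than bounding boxes is an assumption for this sketch.
    """
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter / union if union else 0.0


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
    # Hypothetical 2x2 masks that overlap in one pixel.
    print(mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
```

A model with high precision and lower recall, as reported for the GZMH dataset, makes few false detections but misses a share of the annotated objects; the F1-score summarizes that trade-off in a single number.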

Authors

Anusree Kanadath

Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai International Academic City, 345055, Dubai, United Arab Emirates.

Angel Arul Jothi J

Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai International Academic City, 345055, Dubai, United Arab Emirates. Electronic address: angeljothi@dubai.bits-pilani.ac.in.

Siddhaling Urolagin

Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184 Lund, Sweden. siddhaling@dubai.bits-pilani.ac.in.

Keywords

Algorithms; Databases, Factual; Humans; Image Interpretation, Computer-Assisted; Image Processing, Computer-Assisted

External Resources

PubMed ID: 40669286
