Weakly-supervised thyroid ultrasound segmentation: Leveraging multi-scale consistency, contextual features, and bounding box supervision for accurate target delineation.

Journal: Computers in Biology and Medicine
Published Date:

Abstract

Weakly-supervised learning (WSL) methods have gained significant attention in medical image segmentation, but they often struggle to delineate boundaries accurately because they overfit to weak annotations such as bounding boxes. This issue is particularly pronounced in thyroid ultrasound images, where low contrast and noisy backgrounds hinder precise segmentation. In this paper, we propose a novel weakly-supervised segmentation framework that addresses these challenges. Our framework integrates four key components: the Spatial Arrangement Consistency (SAC) branch, the Hierarchical Prediction Consistency (HPC) branch, the Contextual Feature Integration (CFI) branch, and the Multi-scale Prototype Refinement (MPR) module. These elements work together to enhance segmentation performance and mitigate overfitting to bounding box annotations. Specifically, the SAC branch ensures spatial alignment of the predicted segmentation with the target by evaluating maximum activations along both the horizontal and vertical dimensions of the bounding box. The HPC branch refines prototypes for target and background regions from semantic feature maps and compares the resulting secondary predictions with the initial ones to improve segmentation accuracy. The CFI branch enhances feature representation by incorporating contextual information from neighboring regions, while the MPR module further refines segmentation accuracy by balancing global context and local details through multi-scale feature refinement. We evaluate our method on two thyroid ultrasound datasets, TG3K and TN3K, using comprehensive metrics including mIOU, DSC, HD95, DI, ACC, PR, and SE. On the TG3K dataset, the proposed method achieved an mIOU of 71.85 %, a DSC of 85.92 %, an HD95 of 13.09 mm, and an ACC of 0.93, significantly outperforming existing weakly-supervised methods. On the TN3K dataset, it achieved an mIOU of 70.45 %, a DSC of 84.81 %, an HD95 of 14.16 mm, and an ACC of 0.91, further validating the robustness of the proposed method across datasets. In terms of Precision (PR) and Sensitivity (SE), the proposed method achieved PR = 0.91 and SE = 0.86 on the TG3K dataset, and PR = 0.89 and SE = 0.86 on the TN3K dataset. These results show that our model not only improves segmentation accuracy and boundary delineation (HD95) but also significantly reduces the dependency on pixel-level annotations, providing an effective solution for weakly-supervised thyroid ultrasound segmentation. Our method performs competitively with fully-supervised approaches while reducing annotation time, thereby improving the practicality of deep learning-based segmentation in clinical settings.
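
The abstract describes the SAC branch only in words, so the following is a minimal sketch of one way such a constraint could be written, not the paper's implementation. It assumes a Dice-style comparison between the row-wise and column-wise maximum activations of a predicted probability map and the same projections of the bounding box, similar in spirit to the projection losses used in other box-supervised segmentation work. The function names, the box format (row0, col0, row1, col1, end-exclusive), and the choice of Dice are all illustrative assumptions.

```python
import numpy as np


def box_to_mask(shape, box):
    """Binary mask that is 1 inside the box (row0, col0, row1, col1), end-exclusive."""
    mask = np.zeros(shape, dtype=np.float64)
    r0, c0, r1, c1 = box
    mask[r0:r1, c0:c1] = 1.0
    return mask


def dice_loss_1d(pred, target, eps=1e-6):
    """Soft Dice loss between two 1-D profiles with values in [0, 1]."""
    inter = np.sum(pred * target)
    denom = np.sum(pred * pred) + np.sum(target * target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def projection_consistency_loss(prob, box):
    """Hypothetical SAC-style loss (assumed form, not taken from the paper).

    prob: 2-D array of foreground probabilities.
    box:  (row0, col0, row1, col1), end-exclusive.
    """
    box_mask = box_to_mask(prob.shape, box)
    # Max over rows gives one value per column (horizontal profile).
    loss_horizontal = dice_loss_1d(prob.max(axis=0), box_mask.max(axis=0))
    # Max over columns gives one value per row (vertical profile).
    loss_vertical = dice_loss_1d(prob.max(axis=1), box_mask.max(axis=1))
    return loss_horizontal + loss_vertical


if __name__ == "__main__":
    box = (2, 1, 6, 7)

    # A cross-shaped prediction that touches all four sides of the box.
    tight = np.zeros((8, 8))
    tight[2:6, 3] = 1.0
    tight[4, 1:7] = 1.0

    # A prediction that leaks outside the box on every side.
    leaky = np.ones((8, 8))

    print("tight:", projection_consistency_loss(tight, box))
    print("leaky:", projection_consistency_loss(leaky, box))
    print("empty:", projection_consistency_loss(np.zeros((8, 8)), box))
```

Under these assumptions the loss is near zero whenever the prediction spans the box exactly in both directions, regardless of its shape inside the box, and it grows when the prediction is empty or extends past the box. A projection term of this kind therefore constrains extent rather than shape, which is consistent with the abstract pairing SAC with the HPC, CFI, and MPR components.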

Authors

  • Mohammed Aly
    Department of Artificial Intelligence, Faculty of Artificial Intelligence, Egyptian Russian University, 11829, Badr City, Egypt. Electronic address: mohammed-alysalem@eru.edu.eg.