In Silico Digital Breast Tomosynthesis Dataset for the Comparative Analysis of Deep Learning Models in Tumor Segmentation.
Journal:
Journal of Imaging Informatics in Medicine
Published Date:
Aug 4, 2025
Abstract
The scarcity of publicly available digital breast tomosynthesis (DBT) datasets significantly limits the development of robust deep learning (DL) models for breast tumor segmentation. In this exploratory proof-of-concept study, we assess the viability of in silico-generated DBT data as a training source for tumor segmentation. A dataset of 230 two-dimensional (2D) regions of interest (ROIs), derived from FDA-cleared software and encompassing a spectrum of breast densities and tumor complexities, was used to train 13 DL models, including U-Net, FCN, DeepLabv3, and DeepLabv3+ architectures. Each model was either trained from scratch or fine-tuned from COCO-pretrained weights (ResNet50/101 backbones). Performance was evaluated using F1-score, intersection over union (IoU), precision, and recall. Among all models, U-Net trained from scratch and DeepLabv3+ fine-tuned with a ResNet50 backbone achieved the highest and most consistent results (F1-scores of 82.52% and 84.98%, and per-image IoUs of 78.49% and 83.77%, respectively). No statistically significant differences were found using the Wilcoxon signed-rank test with post hoc Bonferroni correction (p > 0.0042, the Bonferroni-adjusted significance level). To evaluate generalization across domains, the baseline U-Net model was retrained from scratch on a hybrid dataset combining in silico and real-world DBT ROIs, yielding promising results (F1-score of 79%). Despite the domain shift, these findings support the utility of in silico DBT as a complementary resource for training and benchmarking DL models, particularly in data-limited environments. This study provides foundational experimental evidence for integrating computationally generated in silico data into AI-based DBT tumor segmentation research workflows.
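The four evaluation metrics named in the abstract can be illustrated with a minimal per-image sketch. This is not the authors' implementation: the function name `segmentation_metrics`, the nested-list mask format, and the convention of returning 0.0 when a denominator is empty are all assumptions made for illustration, and the abstract does not say how per-image values were aggregated.

```python
from itertools import chain


def segmentation_metrics(pred_mask, true_mask):
    """Compute precision, recall, F1, and IoU for one pair of binary 2D masks.

    Hypothetical helper: masks are nested lists of 0/1 values (rows of pixels).
    Assumption: a metric with an empty denominator is reported as 0.0.
    """
    pred = list(chain.from_iterable(pred_mask))
    true = list(chain.from_iterable(true_mask))
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")

    # Pixel-level confusion counts for the tumor (foreground) class.
    tp = sum(1 for p, t in zip(pred, true) if p and t)
    fp = sum(1 for p, t in zip(pred, true) if p and not t)
    fn = sum(1 for p, t in zip(pred, true) if not p and t)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    union = tp + fp + fn
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 0.0
    iou = tp / union if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy 3x3 example: two pixels overlap, one false positive, one false negative.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
true = [[0, 1, 0],
        [0, 1, 1],
        [0, 0, 0]]
metrics = segmentation_metrics(pred, true)
```

For a single image, F1 and IoU are monotonically related (F1 = 2·IoU / (1 + IoU)), so the two can rank models differently only after averaging across images.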