Comparative performance of one-stage and two-stage deep learning models for instance segmentation of overhanging dental restorations on bitewing radiographs.

Journal: Scientific reports

Abstract

Accurate detection of overhanging dental restorations on bitewing radiographs is clinically important but remains challenging because the marginal discrepancies are subtle. This study aimed to develop and compare deep learning-based instance segmentation models for the automated detection and classification of overhanging restorations, with a specific focus on the relative performance of one-stage and two-stage architectures. A total of 1236 anonymized bitewing radiographs were retrospectively collected and manually annotated by specialist dentists using polygon-based segmentation. The restorations were classified as ideal or overhanging based on established radiographic criteria. One-stage YOLO11-based segmentation models (YOLO11n-seg, YOLO11s-seg, and YOLO11m-seg) were compared with two-stage architectures implemented in Detectron2, including Mask R-CNN, Cascade Mask R-CNN, and PointRend. All models were trained and evaluated using identical train-validation-test splits. Performance was assessed on an independent test set using COCO-style instance segmentation metrics (AP50-95, AP50, AP75, and AR@100), reported both overall and for the clinically relevant overhang class. Uncertainty was quantified using non-parametric bootstrap resampling (B = 200) to derive 95% confidence intervals. All models achieved high AP50 values, indicating effective coarse localization of the restorations. However, two-stage architectures showed consistently higher AP-based point estimates than the one-stage YOLO-based models at stricter overlap thresholds. Cascade Mask R-CNN achieved the highest overall performance (AP50-95 = 0.703), while PointRend yielded the highest overhang-specific AP-based segmentation performance (Overhang AP50-95 = 0.743). YOLO-based models demonstrated lower AP50-95 values despite relatively strong AP50 performance, reflecting reduced boundary precision. Increasing the YOLO model capacity did not improve performance.
Two-stage instance segmentation models, particularly those incorporating boundary-aware refinement, showed higher AP-based performance than one-stage approaches for detecting overhanging dental restorations on bitewing radiographs in this dataset. These findings highlight the importance of architectural design over model scale alone and suggest that two-stage frameworks may offer advantages for precise evaluation of restoration margins in a controlled experimental setting. However, external validation across centers and imaging systems is required before their suitability for routine clinical decision support applications can be established.
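
The bootstrap procedure mentioned above (B = 200, 95% confidence intervals) can be illustrated with a minimal sketch. The abstract does not state the resampling unit or the interval type, so the code below assumes image-level resampling with replacement and a percentile interval. The names `bootstrap_ci` and `frac_at_075` are hypothetical, and the toy metric (fraction of images whose mask IoU reaches 0.75) is only a stand-in for COCO-style AP, not the study's actual evaluation code or data.

```python
import numpy as np


def bootstrap_ci(per_image_items, metric_fn, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a dataset-level metric.

    Assumption: test images are the resampling unit and are drawn with
    replacement; the metric is recomputed on each resampled set.
    """
    rng = np.random.default_rng(seed)
    n = len(per_image_items)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # n indices sampled with replacement
        stats[b] = metric_fn([per_image_items[i] for i in idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return metric_fn(per_image_items), float(lower), float(upper)


def frac_at_075(ious):
    """Toy stand-in metric: share of images with mask IoU >= 0.75."""
    return float(np.mean(np.asarray(ious) >= 0.75))


# Synthetic per-image IoU values, for illustration only (not study data).
toy_ious = np.random.default_rng(1).uniform(0.5, 1.0, size=120).tolist()

point, ci_low, ci_high = bootstrap_ci(toy_ious, frac_at_075)
print(f"point estimate = {point:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

In a real evaluation, `metric_fn` would recompute the full COCO-style metric (for example AP50-95) on each resampled set of images, which is considerably more expensive than this toy metric.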
