AI-Driven MRI-based Brain Tumour Segmentation Benchmarking
Journal: arXiv
Published Date: Jun 25, 2025
Abstract
Medical image segmentation has greatly aided medical diagnosis, with
U-Net-based architectures and nnU-Net providing state-of-the-art performance.
Numerous general-purpose promptable models and medical variants have been
introduced in recent years, but they have not yet been evaluated and compared
across a variety of prompt qualities on a common medical dataset.
This research uses Segment Anything Model (SAM), Segment Anything Model 2 (SAM
2), MedSAM, SAM-Med-3D, and nnU-Net to obtain zero-shot inference on the BraTS
2023 adult glioma and pediatric datasets across multiple prompt qualities for
both points and bounding boxes. Several of these models exhibit promising Dice
scores, particularly SAM and SAM 2, which achieve scores of up to 0.894 and
0.893, respectively, when given extremely accurate bounding box prompts,
exceeding nnU-Net's segmentation performance. However, nnU-Net remains the dominant
medical image segmentation network due to the impracticality of providing
highly accurate prompts to the models. The model and prompt evaluation, as well
as the comparison, are extended through fine-tuning SAM, SAM 2, MedSAM, and
SAM-Med-3D on the pediatric dataset. The improvements in point prompt
performance after fine-tuning are substantial and show promise for future
investigation, but point prompts still do not achieve better segmentation
than bounding box prompts or nnU-Net.
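
For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch, not the authors' evaluation code. It shows the standard Dice score between two binary masks and one plausible way to derive a bounding box prompt of adjustable quality from a ground-truth mask. The function names, the `max_shift` jitter parameter, and the (x_min, y_min, x_max, y_max) box convention are assumptions made for illustration; the paper's actual prompt-generation procedure may differ.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treat as perfect agreement (a common convention).
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / denom


def box_prompt_from_mask(mask, max_shift=0, rng=None):
    """Hypothetical helper: tight box around a 2D mask, optionally jittered.

    max_shift=0 gives the exact (highest-quality) box; larger values move
    each corner coordinate by up to max_shift pixels to mimic a less
    accurate prompt. Returns [x_min, y_min, x_max, y_max].
    """
    mask = np.asarray(mask, dtype=bool)
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty; no box can be derived")
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()], dtype=int)
    if max_shift > 0:
        rng = np.random.default_rng() if rng is None else rng
        box += rng.integers(-max_shift, max_shift + 1, size=4)
    # Keep the box inside the image and keep min <= max on each axis.
    h, w = mask.shape
    box[[0, 2]] = np.sort(np.clip(box[[0, 2]], 0, w - 1))
    box[[1, 3]] = np.sort(np.clip(box[[1, 3]], 0, h - 1))
    return box
```

Under these assumptions, sweeping `max_shift` from zero upward would yield a series of progressively less accurate box prompts, and `dice_score` could then be applied to each model's output against the ground-truth mask.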