MedEBench: Revisiting Text-instructed Image Editing on Medical Domain
Journal: arXiv
Published Date: Jun 2, 2025
Abstract
Text-guided image editing has seen rapid progress in natural image domains,
but its adaptation to medical imaging remains limited and lacks standardized
evaluation. Clinically, such editing holds promise for simulating surgical
outcomes, creating personalized teaching materials, and enhancing patient
communication. To bridge this gap, we introduce MedEBench, a comprehensive
benchmark for evaluating text-guided medical image editing. It consists of
1,182 clinically sourced image-prompt triplets spanning 70 tasks across 13
anatomical regions. MedEBench offers three key contributions: (1) a clinically
relevant evaluation framework covering Editing Accuracy, Contextual
Preservation, and Visual Quality, supported by detailed descriptions of the
expected changes and Region of Interest (ROI) masks; (2) a systematic comparison
of seven state-of-the-art models, revealing common failure patterns; and (3) a
failure analysis protocol based on attention grounding, using IoU between
attention maps and ROIs to identify mislocalization. MedEBench provides a solid
foundation for developing and evaluating reliable, clinically meaningful
medical image editing systems. Project website:
https://mliuby.github.io/MedEBench_Website/
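
The attention-grounding check in contribution (3) can be illustrated with a
minimal sketch. The abstract states only that IoU between attention maps and
ROIs is used to identify mislocalization. Everything else here is an assumption
made for illustration and not the authors' implementation: the min-max
normalization, the 0.5 binarization threshold, and the function names
binarize_attention and attention_roi_iou.

```python
from typing import List, Sequence


def binarize_attention(
    attention: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[List[bool]]:
    """Min-max normalize a 2D attention map, then threshold it.

    Both the normalization and the default threshold are illustrative
    assumptions; the abstract does not specify them.
    """
    flat = [value for row in attention for value in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no localization signal.
        return [[False for _ in row] for row in attention]
    return [[(value - lo) / span >= threshold for value in row] for row in attention]


def attention_roi_iou(
    attention: Sequence[Sequence[float]],
    roi_mask: Sequence[Sequence[int]],
    threshold: float = 0.5,
) -> float:
    """Intersection over union between a binarized attention map and an ROI mask."""
    if len(attention) != len(roi_mask) or any(
        len(a_row) != len(r_row) for a_row, r_row in zip(attention, roi_mask)
    ):
        raise ValueError("attention map and ROI mask must have the same shape")

    attended = binarize_attention(attention, threshold)
    intersection = 0
    union = 0
    for att_row, roi_row in zip(attended, roi_mask):
        for is_attended, roi_value in zip(att_row, roi_row):
            in_roi = bool(roi_value)
            intersection += int(is_attended and in_roi)
            union += int(is_attended or in_roi)
    return intersection / union if union else 0.0


# Toy 3x3 example: attention is concentrated near, but not exactly on, the ROI.
toy_attention = [
    [0.0, 0.2, 0.9],
    [0.1, 0.8, 1.0],
    [0.0, 0.1, 0.3],
]
toy_roi = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]
toy_iou = attention_roi_iou(toy_attention, toy_roi)
```

Under these assumptions, a low IoU would suggest that the model attended to a
region other than the one the edit was meant to change, which is the kind of
mislocalization the protocol is described as detecting.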