Advancing Volumetric Medical Image Segmentation via Global-Local Masked Autoencoders
Journal:
IEEE Transactions on Medical Imaging
Published Date:
May 14, 2025
Abstract
Masked Autoencoder (MAE) is a self-supervised pre-training technique that holds promise for improving the representation learning of neural networks. However, directly applying MAE to volumetric medical images poses two challenges: (i) insufficient global information for clinical context understanding of the holistic data, and (ii) no assurance that the representations learned from randomly masked inputs are stable. To overcome these limitations, we propose Global-Local Masked AutoEncoders (GL-MAE), a simple yet effective self-supervised pre-training strategy. GL-MAE acquires robust anatomical structure features through multi-level reconstruction, from fine-grained local details to high-level global semantics. Furthermore, a complete global view serves as an anchor that directs anatomical semantic alignment and stabilizes learning via global-to-global and global-to-local consistency learning. Fine-tuning results on eight mainstream public datasets demonstrate the superiority of our method over other state-of-the-art self-supervised algorithms, highlighting its effectiveness on versatile volumetric medical image segmentation and classification tasks. We will release code upon acceptance at https://github.com/JiaxinZhuang/GL-MAE.
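To make the two ingredients named in the abstract concrete, below is a minimal NumPy sketch of (a) MAE-style random masking of 3D patch tokens from a global view versus a local sub-volume crop, and (b) a global-to-local consistency term. The patch size, mask ratio, volume shape, and the cosine-distance form of the consistency loss are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(volume, p):
    """Split a (D, H, W) volume into non-overlapping p^3 patch tokens."""
    d, h, w = volume.shape
    patches = volume.reshape(d // p, p, h // p, p, w // p, p)
    # Gather the three patch-grid axes first, then flatten each patch.
    return patches.transpose(0, 2, 4, 1, 3, 5).reshape(-1, p ** 3)

def random_mask(patches, ratio, rng):
    """MAE-style masking: keep a random subset of visible patch tokens."""
    n = patches.shape[0]
    keep = max(1, int(round(n * (1 - ratio))))
    visible_idx = rng.permutation(n)[:keep]
    return patches[visible_idx], visible_idx

def consistency_loss(view_a, view_b):
    """Toy global-to-local consistency: cosine distance between
    mean-pooled token representations of two views."""
    a, b = view_a.mean(axis=0), view_b.mean(axis=0)
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - cos

# Toy volume standing in for a CT/MRI scan (hypothetical size).
volume = rng.standard_normal((16, 16, 16))
global_patches = patchify(volume, p=4)              # global view: 64 tokens
local_patches = patchify(volume[:8, :8, :8], p=4)   # local crop: 8 tokens

visible, visible_idx = random_mask(global_patches, ratio=0.75, rng=rng)
loss = consistency_loss(global_patches, local_patches)
```

In the actual method, the pooled representations would come from an encoder network rather than raw voxels, and the masked tokens would be reconstructed by a decoder; this sketch only illustrates the data flow of the masking and consistency steps.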