Multi-scale interaction and locally enhanced bridging network for medical image segmentation.

Journal: Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society
Published Date:

Abstract

Accurate organ segmentation is crucial for precise medical diagnosis. Recent methods based on CNNs and Transformers have significantly improved automatic medical image segmentation. However, their encoders and decoders often rely on simple skip connections, which fail to integrate multi-scale features effectively. This causes a misalignment between low-resolution global features and high-resolution spatial information, so segmentation accuracy suffers, particularly for global contours and local details. To address this limitation, MILENet, a multi-scale interaction and locally enhanced bridging network, is proposed. Its context bridge incorporates a multi-scale interaction module that reorganizes multi-scale features and ensures global correlation. A local enhancement module is also introduced; it combines a dilated coordinate attention mechanism with a locally enhanced feed-forward network (FFN) built from a cascaded convolutional structure, which strengthens local context modeling and improves feature discrimination. Furthermore, a source-driven connection mechanism preserves detailed information across layers, providing richer features for decoder reconstruction. Together, these components allow MILENet to align multi-scale features and enhance local details, thereby improving segmentation accuracy. MILENet was evaluated on publicly available datasets spanning abdominal CT (Synapse), cardiac MRI (ACDC), and colonoscopy RGB images (Kvasir, CVC-ClinicDB, CVC-ColonDB, CVC-300, and ETIS-LaribDB). The results show that MILENet achieves state-of-the-art performance across these modalities. It handles both large-organ segmentation in CT/MRI and fine-grained polyp delineation in endoscopic images, demonstrating strong generalizability to diverse anatomical structures and imaging conditions. The code has been released on GitHub: https://github.com/syzhou1226/MILENET.
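
The abstract mentions a dilated coordinate attention mechanism but does not say how dilation is applied inside it. As a minimal illustration of dilation alone, and not of MILENet's actual module, the sketch below applies a zero-padded 1D dilated convolution in plain Python. The function names `dilated_conv1d` and `receptive_field`, the 1D setting, and the zero padding are all assumptions made for illustration. It shows how spacing out the kernel taps widens the receptive field without adding weights.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded 1D dilated convolution (cross-correlation form).

    Illustrative only: a hypothetical helper, not taken from the MILENet code.
    The output has the same length as the input.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    if dilation < 1:
        raise ValueError("dilation must be >= 1")

    center = k // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    print(dilated_conv1d(impulse, [1, 1, 1], dilation=1))
    print(dilated_conv1d(impulse, [1, 1, 1], dilation=2))
    print(receptive_field(3, 2))
```

With a three-tap kernel, a dilation of 2 spans five input positions instead of three, which is the usual reason for using dilation when broader local context is needed at the same parameter count.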

Authors

  • Zhiyong Huang
    Department of Computer Science, NUS School of Computing, National University of Singapore, Singapore.
  • Shiyao Zhou
    Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei, 230022, Anhui, China.
  • Zhi Yu
    ModiFace - A L'Oréal Group Company, Toronto, ON, Canada.
  • Mingyang Hou
    School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China.
  • Zhiyu Zhao
    School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China. Electronic address: zzy1259983302@163.com.
  • Xiaoyu Li
    Department of Gastroenterology, The Affiliated Hospital of Qingdao University, Qingdao, China.
  • Jiahong Wang
    Materials Artificial Intelligence Center, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen, 518055, P.R. China.
  • Yan Yan
    Department of Biomedical Engineering, Wayne State University, Detroit, Michigan, USA.
  • Yushi Liu
    Department of Ophthalmology, Peking University Third Hospital, Beijing, People's Republic of China; Beijing Key Laboratory of Restoration of Damaged Ocular Nerve, Peking University Third Hospital, People's Republic of China.
  • Hans Gregersen
    California Medical Innovations Institute, San Diego 92121, California.