CQformer: Learning Dynamics Across Slices in Medical Image Segmentation.

Journal: IEEE Transactions on Medical Imaging

Abstract

Prevalent studies on deep learning-based 3D medical image segmentation capture the continuous variation across 2D slices mainly via convolution, Transformers, inter-slice interaction, or time-series models. In this work, we model this variation with an ordinary differential equation (ODE) and propose a cross-instance query-guided Transformer architecture (CQformer) that leverages features from preceding slices to improve the segmentation of subsequent slices. Its key components include a cross-attention mechanism embedded in an ODE formulation, which bridges the features of contiguous 2D slices of the 3D volumetric data. In addition, a regression head is employed to shorten the gap between the bottleneck and the prediction layer. Extensive experiments on 7 datasets spanning multiple modalities (CT, MRI) and tasks (organ, tissue, and lesion segmentation) demonstrate that CQformer outperforms previous state-of-the-art segmentation algorithms on 6 datasets by 0.44%-2.45%, and achieves the second-highest performance of 88.30% on the BTCV dataset. The code is available at https://github.com/qbmizsj/CQformer.
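
To make the central idea concrete, the following is a minimal conceptual sketch (not the authors' implementation; see the repository above for that) of an Euler-style ODE update in which each slice's features are refined via cross-attention to the preceding slice's features. All class names, shapes, and the step size dt are illustrative assumptions.

# Minimal sketch: cross-attention between contiguous slices wrapped in a
# forward-Euler ODE update, h_t <- h_t + dt * CrossAttn(h_t, h_{t-1}).
# Hypothetical names and shapes; not the CQformer reference code.
import torch
import torch.nn as nn


class CrossSliceODEBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, dt: float = 0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.dt = dt  # assumed integration step size

    def forward(self, curr_tokens: torch.Tensor, prev_tokens: torch.Tensor) -> torch.Tensor:
        # Query: current slice tokens; key/value: preceding slice tokens.
        delta, _ = self.attn(self.norm(curr_tokens), prev_tokens, prev_tokens)
        return curr_tokens + self.dt * delta  # explicit Euler step


if __name__ == "__main__":
    # Toy volume flattened into per-slice token sequences.
    batch, num_slices, tokens, dim = 2, 8, 196, 64
    volume_feats = torch.randn(batch, num_slices, tokens, dim)

    block = CrossSliceODEBlock(dim)
    refined = [volume_feats[:, 0]]           # first slice has no predecessor
    for s in range(1, num_slices):
        refined.append(block(volume_feats[:, s], refined[-1]))
    out = torch.stack(refined, dim=1)        # (batch, num_slices, tokens, dim)
    print(out.shape)

The sketch only illustrates the propagation of information from preceding to subsequent slices; the paper's full architecture (including the query guidance and the regression head) is described in the publication and released code.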

Authors

  • Shengjie Zhang
    CAS Key Laboratory of Nutrition, Metabolism and Food Safety, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China.
  • Xin Shen
  • Xiang Chen
    Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, Zhejiang, China.
  • Ziqi Yu
    College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China.
  • Bohan Ren
  • Haibo Yang
    Academy of Psychology and Behavior, Tianjin Normal University, No. 57-1 Wujiayao Street, Hexi District, Tianjin 300074, China.
  • Xiao-Yong Zhang
  • Yuan Zhou
    Department of Pharmacy, Taihe Hospital, Hubei University of Medicine, Shiyan, China.