S2DA-Net: Spatial and spectral-learning double-branch aggregation network for liver tumor segmentation in CT images.

Journal: Computers in biology and medicine
Published Date:

Abstract

Accurate liver tumor segmentation is crucial for aiding radiologists in hepatocellular carcinoma evaluation and surgical planning. While convolutional neural networks (CNNs) have been successful in medical image segmentation, they struggle to capture long-range dependencies among pixels. Transformer-based models, on the other hand, require a large number of parameters and incur significant computational costs. To address these issues, we propose the Spatial and Spectral-learning Double-branched Aggregation Network (S2DA-Net) for liver tumor segmentation. S2DA-Net consists of a double-branched encoder and a decoder with a Group Multi-Head Cross-Attention Aggregation (GMCA) module. The two encoder branches are a Fourier Spectral-learning Multi-scale Fusion (FSMF) branch and a Multi-axis Aggregation Hadamard Attention (MAHA) branch. The FSMF branch employs a Fourier-based network to learn amplitude and phase information, capturing richer features and detailed information without introducing an excessive number of parameters. The MAHA branch incorporates spatial information, enhancing discriminative features while minimizing computational costs. In the decoding path, the GMCA module extracts local information and establishes long-range dependencies, improving localization by aggregating features from the different branches. Experimental results on the public LiTS2017 liver tumor dataset show that the proposed segmentation model achieves significant improvements over state-of-the-art methods, obtaining a Dice per case (DPC) of 69.4% and a global Dice (DG) of 80.0% for liver tumor segmentation. Meanwhile, the model pre-trained on the LiTS2017 dataset obtains a DPC of 73.4% and a DG of 82.2% on the 3DIRCADb dataset.
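
The abstract reports two Dice variants. As they are commonly defined for LiTS-style evaluation, Dice per case averages the Dice score of each CT volume, while global Dice pools all volumes into one large volume before computing a single score. The following is a minimal NumPy sketch of that distinction, not the authors' evaluation code. The function names and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
import numpy as np


def dice(pred, target):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * intersection / denom)


def dice_per_case(preds, targets):
    """Mean of the per-volume Dice scores (every case weighs the same)."""
    return float(np.mean([dice(p, t) for p, t in zip(preds, targets)]))


def global_dice(preds, targets):
    """Dice over all volumes pooled together (large tumors weigh more)."""
    intersection = 0
    denom = 0
    for p, t in zip(preds, targets):
        p = np.asarray(p, dtype=bool)
        t = np.asarray(t, dtype=bool)
        intersection += np.logical_and(p, t).sum()
        denom += p.sum() + t.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * intersection / denom)


# Toy example: a small, partly missed tumor and a large, perfectly segmented one.
preds = [np.array([1, 1, 0, 0]), np.ones(6, dtype=int)]
targets = [np.array([1, 0, 0, 0]), np.ones(6, dtype=int)]

dpc = dice_per_case(preds, targets)  # (2/3 + 1) / 2
dg = global_dice(preds, targets)     # 2 * 7 / 15
print(f"DPC = {dpc:.4f}, DG = {dg:.4f}")
```

In the toy example the small case pulls Dice per case down more than global Dice, which is consistent with the reported DG values being higher than the DPC values on both datasets.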

Authors

  • Huaxiang Liu
    Department of Radiology, Taizhou Hospital, Zhejiang University, Taizhou, 318000, Zhejiang, China; Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China; Key Laboratory of Evidence-based Radiology of Taizhou, Taizhou, 317000, Zhejiang, China.
  • Jie Yang
    Key Laboratory of Development and Maternal and Child Diseases of Sichuan Province, Department of Pediatrics, Sichuan University, Chengdu, China.
  • Chao Jiang
    Zhejiang Provincial Key Laboratory of Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang, China.
  • Sailing He
    State Key Laboratory of Modern Optical Instrumentations, Centre for Optical and Electromagnetic Research, Zhejiang University, Hangzhou, Zhejiang, 310058, China.
  • Youyao Fu
    Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China.
  • Shiqing Zhang
    Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China.
  • Xudong Hu
    School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, China.
  • Jiangxiong Fang
    Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China. Electronic address: fangchj202@163.com.
  • Wenbin Ji
    Department of Radiology, Taizhou Hospital, Zhejiang University, Taizhou, Zhejiang, China.