CoTrFuse: a novel framework by fusing CNN and transformer for medical image segmentation.

Journal: Physics in Medicine and Biology
Published Date:

Abstract

Medical image segmentation is a crucial and intricate process in medical image processing and analysis. With advances in artificial intelligence, deep learning techniques have been widely used for medical image segmentation in recent years. One such technique is the U-Net framework, based on U-shaped convolutional neural networks (CNNs), and its variants. However, these methods are limited in capturing global and long-range semantic information because of the restricted receptive field inherent to the convolution operation. Transformers are attention-based models with excellent global modeling capabilities, but their ability to acquire local information is limited. To address this, we propose CoTrFuse, a network that combines the strengths of both CNNs and Transformers. CoTrFuse uses EfficientNet and Swin Transformer as dual encoders. The Swin Transformer and CNN Fusion module fuses the features of both branches before the skip connection structure. We evaluated the proposed network on two datasets: the ISIC-2017 challenge dataset and the COVID-QU-Ex dataset. Our experimental results demonstrate that CoTrFuse outperforms several state-of-the-art segmentation methods, indicating its superiority in medical image segmentation. The code is available at https://github.com/BinYCn/CoTrFuse.
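The abstract does not describe the internals of the fusion module, so the following is only a generic illustration of what fusing two encoder branches can look like: the channels of a CNN feature map and a Transformer feature map are concatenated at each spatial position and then mixed by a per-position linear projection (the equivalent of a 1x1 convolution). All names here (`concat_channels`, `project`, `fuse`) and the toy weights are hypothetical and are not taken from the CoTrFuse code; the sketch is written in dependency-free Python rather than a deep learning framework.

```python
from typing import List

# A feature map is stored as [position][channel]. Both branches are assumed
# (for this sketch) to have been resized to the same spatial positions.
FeatureMap = List[List[float]]


def concat_channels(cnn_feat: FeatureMap, trans_feat: FeatureMap) -> FeatureMap:
    """Concatenate the channels of the two branches at every position."""
    if len(cnn_feat) != len(trans_feat):
        raise ValueError("both branches must cover the same spatial positions")
    return [c + t for c, t in zip(cnn_feat, trans_feat)]


def project(feat: FeatureMap, weight: List[List[float]]) -> FeatureMap:
    """Per-position linear projection; weight is [out_channels][in_channels]."""
    return [
        [sum(w * x for w, x in zip(row, pos)) for row in weight]
        for pos in feat
    ]


def fuse(cnn_feat: FeatureMap, trans_feat: FeatureMap,
         weight: List[List[float]]) -> FeatureMap:
    """Generic concatenate-then-project fusion (an assumption, not STCF itself)."""
    return project(concat_channels(cnn_feat, trans_feat), weight)


# Toy inputs: 2 spatial positions, 2 CNN channels, 1 Transformer channel.
cnn_feat = [[1.0, 2.0], [3.0, 4.0]]
trans_feat = [[0.5], [1.5]]
# Hypothetical projection from 3 concatenated channels down to 2.
weight = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]

fused = fuse(cnn_feat, trans_feat, weight)
print(fused)
```

In a real network the projection weights would be learned, and the fused map would then be passed to the decoder through the skip connections; the authors' repository is the reference for how CoTrFuse actually does this.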

Authors

  • Yuanbin Chen
    College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China.
  • Tao Wang
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Hui Tang
    Department of Pharmacy, The Affiliated Hospital of Southwest Medical University, Luzhou, China.
  • Longxuan Zhao
    College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China.
  • Xinlin Zhang
    Department of Electronic Science, Biomedical Intelligent Cloud Research and Development Center, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China.
  • Tao Tan
    Faculty of Applied Sciences, Macao Polytechnic University, Macao, China.
  • Qinquan Gao
    College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China.
  • Min Du
    College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China.
  • Tong Tong
    CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.