HybridCTrm: Bridging CNN and Transformer for Multimodal Brain Image Segmentation.

Journal: Journal of Healthcare Engineering
Published Date:

Abstract

Multimodal segmentation is a long-standing critical problem in medical image analysis. Traditional deep learning methods rely entirely on CNNs to encode the input images, which leaves them weak at capturing long-range dependencies and poor at generalization. Recently, a series of Transformer-based methods has emerged in image processing, bringing strong generalization and performance across various tasks. Traditional CNNs, on the other hand, retain advantages of their own, such as rapid convergence and local representations. We therefore analyze a hybrid multimodal segmentation method based on Transformers and CNNs and propose a novel architecture, the HybridCTrm network. We conduct experiments with HybridCTrm on two benchmark datasets and compare it with HyperDenseNet, a network built entirely on CNNs. Results show that HybridCTrm outperforms HyperDenseNet on most of the evaluation metrics. Furthermore, we analyze how the depth of the Transformer influences performance. We also visualize the results and examine how our hybrid method improves the segmentations.

Authors

  • Qixuan Sun
    Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, China.
  • Nianhua Fang
    Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, China.
  • Zhuo Liu
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Liang Zhao
    Graduate School of Advanced Integrated Studies in Human Survivability (Shishu-Kan), Kyoto University, Kyoto, Japan.
  • Youpeng Wen
    Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, China.
  • Hongxiang Lin
    Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, China.