WSDC-ViT: a novel transformer network for pneumonia image classification based on windows scalable attention and dynamic rectified linear unit convolutional modules.

Journal: Scientific reports
Published Date:

Abstract

Accurate differential diagnosis of pneumonia remains a challenging task, as different types of pneumonia require distinct treatment strategies. Early and precise diagnosis is crucial for minimizing the risk of misdiagnosis and for effectively guiding clinical decision-making and monitoring treatment response. This study proposes the WSDC-ViT network to enhance computer-aided pneumonia detection and alleviate the diagnostic workload for radiologists. Unlike existing models such as Swin Transformer or CoAtNet, which primarily improve attention mechanisms through hierarchical designs or convolutional embedding, WSDC-ViT introduces a novel architecture that simultaneously enhances global and local feature extraction through a scalable self-attention mechanism and convolutional refinement. Specifically, the network integrates a scalable self-attention mechanism that decouples the query, key, and value dimensions to reduce computational overhead and improve contextual learning, while an interactive window-based attention module further strengthens long-range dependency modeling. Additionally, a convolution-based module equipped with a dynamic ReLU activation function is embedded within the transformer encoder to capture fine-grained local details and adaptively enhance feature expression. Experimental results demonstrate that the proposed method achieves an average classification accuracy of 95.13% and an F1-score of 95.63% on a chest X-ray dataset, along with 99.36% accuracy and a 99.34% F1-score on a CT dataset. These results highlight the model's superior performance compared to existing automated pneumonia classification approaches, underscoring its potential clinical applicability.
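
The dynamic ReLU mentioned above can be illustrated with a small sketch. The abstract does not specify the activation's exact form, so the code below assumes the commonly used two-piece formulation, in which the slopes and intercepts of the linear pieces are computed from a global summary of the input instead of being fixed. The function name, the scalar weights `w_slope` and `w_bias`, and the scaling constants are hypothetical choices made for illustration; they are not taken from the WSDC-ViT implementation.

```python
import math

def sigmoid(z):
    """Logistic function, used to squash the hyperfunction output."""
    return 1.0 / (1.0 + math.exp(-z))

def dynamic_relu(x, w_slope, w_bias, lam_slope=1.0, lam_bias=0.5):
    """Apply a two-piece dynamic ReLU to one channel (illustrative only).

    x        : list of floats (a flattened single-channel feature map)
    w_slope  : two hypothetical scalar weights producing slope offsets
    w_bias   : two hypothetical scalar weights producing intercept offsets
    """
    # Global context: the mean activation of the channel (average pooling).
    context = sum(x) / len(x)

    # Base coefficients reproduce plain ReLU: max(1*x + 0, 0*x + 0).
    base_slopes = (1.0, 0.0)
    base_biases = (0.0, 0.0)

    # Input-dependent offsets, normalized to (-1, 1) via 2*sigmoid(.) - 1.
    slopes = [a + lam_slope * (2.0 * sigmoid(w * context) - 1.0)
              for a, w in zip(base_slopes, w_slope)]
    biases = [b + lam_bias * (2.0 * sigmoid(w * context) - 1.0)
              for b, w in zip(base_biases, w_bias)]

    # Output is the maximum over the linear pieces, element by element.
    return [max(a * v + b for a, b in zip(slopes, biases)) for v in x]

features = [-1.0, 3.0]
print(dynamic_relu(features, [0.0, 0.0], [0.0, 0.0]))  # zero weights: plain ReLU
print(dynamic_relu(features, [1.0, 1.0], [0.0, 0.0]))  # slopes adapt to the input
```

With all weights at zero the function reduces to an ordinary ReLU. With non-zero weights the activation's shape changes with the content of each input, which is the sense in which such a module can "adaptively enhance feature expression". In a real network a small learned sub-network would take the place of the scalar weights.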

Authors

  • Yu Gu
    Microsoft Research, Redmond, WA, USA.
  • Haotian Bai
    Key Laboratory of Organic Solids, Beijing National Laboratory for Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences, Beijing, 100190, P.R. China.
  • Meng Chen
    Institute of Industrial and Consumer Product Safety, China Academy of Inspection and Quarantine, Beijing, China.
  • Lidong Yang
    Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, China.
  • Baohua Zhang
    Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, China.
  • Jing Wang
    Endoscopy Center, Peking University Cancer Hospital and Institute, Beijing, China.
  • Xiaoqi Lu
    School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China; Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, China. Electronic address: lxiaoqi@imut.edu.cn.
  • Jianjun Li
    Rehabilitation Clinic, Shenzhen University General Hospital, Shenzhen, Guangdong, China.
  • Xin Liu
    Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences, Weifang, Shandong, China.
  • Dahua Yu
    Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, China.
  • Ying Zhao
    Department of Pharmacy, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Siyuan Tang
    Baotou Medical College, Inner Mongolia University of Science and Technology, Baotou, China.
  • Qun He
    Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, China.