A novel UNet-SegNet and vision transformer architectures for efficient segmentation and classification in medical imaging.

Journal: Physical and engineering sciences in medicine
Published Date:

Abstract

Medical imaging has become an essential tool in the diagnosis and treatment of various diseases, and provides critical insights through ultrasound, MRI, and X-ray modalities. Despite its importance, challenges remain in the accurate segmentation and classification of complex structures owing to factors such as low contrast, noise, and irregular anatomical shapes. This study addresses these challenges by proposing a novel hybrid deep learning model that integrates the strengths of Convolutional Autoencoders (CAE), UNet, and SegNet architectures. In the preprocessing phase, a Convolutional Autoencoder is used to effectively reduce noise while preserving essential image details, ensuring that the images used for segmentation and classification are of high quality. The ability of CAE to denoise images while retaining critical features enhances the accuracy of the subsequent analysis. The developed model employs UNet for multiscale feature extraction and SegNet for precise boundary reconstruction, with Dynamic Feature Fusion integrated at each skip connection to dynamically weight and combine the feature maps from the encoder and decoder. This ensures that both global and local features are effectively captured, while emphasizing the critical regions for segmentation. To further enhance the model's performance, the Hybrid Emperor Penguin Optimizer (HEPO) was employed for feature selection, while the Hybrid Vision Transformer with Convolutional Embedding (HyViT-CE) was used for the classification task. This hybrid approach allows the model to maintain high accuracy across different medical imaging tasks. The model was evaluated using three major datasets: brain tumor MRI, breast ultrasound, and chest X-rays. The results demonstrate exceptional performance, achieving an accuracy of 99.92% for brain tumor segmentation, 99.67% for breast cancer detection, and 99.93% for chest X-ray classification. These outcomes highlight the ability of the model to deliver reliable and accurate diagnostics across various medical contexts, underscoring its potential as a valuable tool in clinical settings. The findings of this study will contribute to advancing deep learning applications in medical imaging, addressing existing research gaps, and offering a robust solution for improved patient care.

Authors

  • Simon Tongbram
    ECE Department, National Institute of Technology Manipur, Imphal, India. simontongbram@nitmanipur.ac.in.
  • Benjamin A Shimray
    Department of Electrical Engineering, National Institute of Technology Manipur, Manipur, India.
  • Loitongbam Surajkumar Singh
    ECE Department, National Institute of Technology Manipur, Imphal, India.

Keywords

No keywords available for this article.