An enhanced denoising system for mammogram images using deep transformer model with fusion of local and global features.

Journal: Scientific Reports
PMID:

Abstract

Image denoising is a critical problem in low-level computer vision, where the aim is to reconstruct a clean, noise-free image from a noisy input such as a mammogram. In recent years, deep learning, particularly convolutional neural networks (CNNs), has shown great success in various image processing tasks, including denoising, image compression, and enhancement. While CNN-based approaches dominate, Transformer models have recently gained popularity for computer vision tasks; however, they have seen fewer applications in low-level vision problems such as image denoising. In this study, a novel denoising network architecture called DeepTFormer is proposed, which leverages Transformer models for this task. The DeepTFormer architecture consists of three main components: a preprocessing module, a local-global feature extraction module, and a reconstruction module. The local-global feature extraction module is the core of DeepTFormer and comprises several groups of ITransformer layers. Each group includes a series of Transformer layers, convolutional layers, and residual connections, and the groups themselves are tightly coupled through residual connections, allowing the model to effectively capture both local and global information from noisy images. This design ensures that the model can exploit local features for fine details and global features for larger context, leading to more accurate denoising. To validate the performance of the DeepTFormer model, extensive experiments were conducted on both synthetic and real noise data. Objective and subjective evaluations showed that DeepTFormer outperforms leading denoising methods, surpassing state-of-the-art techniques on key metrics, with PSNR, FSIM, EPI, and SSIM values of 0.41, 0.93, 0.96, and 0.94, respectively. These results demonstrate that DeepTFormer is a highly effective solution for image denoising, combining the power of the Transformer architecture with convolutional layers to enhance both local and global feature extraction.
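The pipeline described above (a preprocessing module, stacked local-global groups that mix Transformer layers with convolutions and residual connections, and a reconstruction module) can be sketched in PyTorch as follows. This is a minimal illustrative sketch based only on the abstract's description: the class names (ITGroup, DeepTFormerSketch), the number of groups and layers, channel widths, and attention settings are assumptions for illustration and do not reproduce the authors' implementation.

```python
# Minimal sketch of the DeepTFormer-style denoiser described in the abstract.
# All hyperparameters and module names are illustrative assumptions only.
import torch
import torch.nn as nn


class ITGroup(nn.Module):
    """One local-global group: Transformer layers for global context,
    a convolution for local detail, and a residual connection."""

    def __init__(self, dim=64, heads=4, depth=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=dim * 2, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        self.conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1)

    def forward(self, x):                        # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)    # (B, H*W, C) token sequence
        tokens = self.transformer(tokens)        # global (long-range) features
        y = tokens.transpose(1, 2).view(b, c, h, w)
        y = self.conv(y)                         # local (fine-detail) features
        return x + y                             # residual connection


class DeepTFormerSketch(nn.Module):
    def __init__(self, dim=64, groups=4):
        super().__init__()
        self.pre = nn.Conv2d(1, dim, 3, padding=1)            # preprocessing
        self.groups = nn.ModuleList([ITGroup(dim) for _ in range(groups)])
        self.recon = nn.Conv2d(dim, 1, 3, padding=1)          # reconstruction

    def forward(self, noisy):
        feat = self.pre(noisy)
        shallow = feat
        for g in self.groups:                    # tightly coupled groups
            feat = g(feat)
        feat = feat + shallow                    # long residual over all groups
        return self.recon(feat)                  # predicted clean image


if __name__ == "__main__":
    model = DeepTFormerSketch()
    x = torch.randn(1, 1, 64, 64)                # toy single-channel patch
    print(model(x).shape)                        # torch.Size([1, 1, 64, 64])
```

In this sketch the Transformer encoder operates on the full token sequence of a feature map, which is only practical for small patches; the paper's ITransformer layers presumably use a more efficient attention scheme, but the abstract does not specify it.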

Authors

  • A Robert Singh
    Department of Computational Intelligence, SRM Institute of Science and Technology, Chennai, Tamil Nadu, 603203, India.
  • Suganya Athisayamani
    School of Computing, Sastra Deemed to be University, Thanjavur, Tamil Nadu, 613401, India.
  • Faten Khalid Karim
    Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia.
  • Ahmed Zohair Ibrahim
    Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia.
  • Sameer Alshetewi
    General Information Technology Department, Ministry of Defense, The Executive Affairs, Excellence Services Directorate, P.O. Box 11564, 56688, Riyadh, Saudi Arabia.
  • Samih M Mostafa
    Faculty of Computers and Information, South Valley University, Qena 83523, Egypt.