Bladder lesion detection using EfficientNet and hybrid attention transformer through attention transformation.

Journal: Scientific Reports
Published Date:

Abstract

Bladder cancer is difficult to diagnose because tumor features are intricate and highly variable. Moreover, morphological similarities among cancerous cells make manual diagnosis time-consuming. Recently, machine learning and deep learning methods have been used to diagnose bladder cancer. However, machine learning requires manually engineered features and deep learning requires large volumes of data, which makes both less reliable for real-time application. This study developed a hybrid model that combines a CNN (Convolutional Neural Network) with a less attention-based ViT (Vision Transformer) for bladder lesion diagnosis. The hybrid model contains two InceptionV3 blocks to extract spatial features. Furthermore, the global correlation of the features is captured by hybrid attention modules incorporated in the ViT encoder. Evaluated on a dataset of 17,540 endoscopic images with a 5-fold cross-validation strategy, the model achieved an average accuracy, precision and F1-score of 97.73%, 97.21% and 96.86%, respectively. We compared the proposed method with CNN-based and ViT-based methods under the same experimental conditions, and it outperformed them.
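
The evaluation protocol named in the abstract (5-fold cross-validation with averaged accuracy, precision and F1-score) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the contiguous unshuffled folds, the binary labelling (1 = lesion) and the `train_and_predict` callable are all assumptions made for the example. A real pipeline would typically shuffle or stratify the folds and may use multi-class metrics.

```python
from statistics import mean


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds (no shuffling)."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, f1) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, f1


def cross_validate(labels, train_and_predict, k=5):
    """Average the per-fold metrics over k folds.

    train_and_predict is a hypothetical callable: given train and test
    indices, it fits a model on the training part and returns predicted
    labels for the test part, in test_idx order.
    """
    per_fold = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        y_pred = train_and_predict(train_idx, test_idx)
        y_true = [labels[i] for i in test_idx]
        per_fold.append(binary_metrics(y_true, y_pred))
    # zip(*per_fold) regroups the tuples by metric before averaging.
    return tuple(mean(metric) for metric in zip(*per_fold))
```

With 17,540 images and five folds, each fold in this sketch holds out 3,508 images for testing and trains on the remaining 14,032.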

Authors

  • Poonam Sharma
    Nexgen Precision, Dallas, TX.
  • Bhisham Sharma
    Chitkara University School of Engineering and Technology, Chitkara University, Himachal Pradesh, India.
  • Dhirendra Prasad Yadav
    Department of Computer Engineering and Applications, GLA University, Mathura 281406, Uttar Pradesh, India.
  • Deepti Thakral
    Department of Computer Science and Technology, Manav Rachna University, Faridabad, India.
  • Julian L Webber
    Graduate School of Engineering Science, Osaka University, Osaka, Japan.