An attention based hybrid approach using CNN and BiLSTM for improved skin lesion classification.

Journal: Scientific Reports
PMID:

Abstract

Skin lesions remain a significant global health issue, and their incidence has risen steadily in recent years. Early and accurate detection is crucial for effective treatment and better patient outcomes. This work integrates Convolutional Neural Networks (CNNs) with Bidirectional Long Short-Term Memory (BiLSTM) networks, enhanced by spatial, channel, and temporal attention mechanisms, to improve the classification of skin lesions. The hybrid model is trained to distinguish between various types of skin lesions. Among the models evaluated, the CNN (original architecture) with BiLSTM and attention mechanisms achieved the highest performance, with an accuracy of 92.73%, precision of 92.84%, F1 score of 92.70%, recall of 92.73%, Jaccard Index (JAC) of 87.08%, Dice Coefficient (DIC) of 92.70%, and Matthews Correlation Coefficient (MCC) of 91.55%. To highlight the efficacy of the proposed approach, the model was compared with other configurations: CNN with Gated Recurrent Units (GRU) and attention mechanisms, CNN with LSTM and attention mechanisms, CNN with BiGRU and attention mechanisms, CNN with BiLSTM, CNN with LSTM, CNN with BiGRU, CNN with GRU, standalone CNN, InceptionV3, Visual Geometry Group-16 (VGG16), and Xception. This research aims to give healthcare professionals a robust diagnostic tool that improves accuracy and supports proactive management strategies. The model's ability to analyze high-resolution images and capture complex features of skin lesions could support earlier detection and more personalized treatment. Beyond advancing the technology of skin lesion diagnostics, this work aims to mitigate the impact of skin lesions through timely interventions and improved healthcare outcomes, ultimately strengthening public health resilience on a global scale.
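
As an illustration of how the metrics reported in the abstract relate to one another, the following is a minimal sketch that derives them from a multiclass confusion matrix. It is not the authors' evaluation code. The function names (confusion_matrix, classification_metrics) and the toy labels are hypothetical, and weighting the per-class scores by class support is an assumption, since the abstract does not state which averaging was used. The per-class Dice coefficient is algebraically the same quantity as the per-class F1 score, and the MCC shown is the standard multiclass generalization computed from the confusion matrix.

```python
from math import sqrt


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def classification_metrics(y_true, y_pred, n_classes):
    """Support-weighted metrics (the weighting is an assumption, see lead-in)."""
    cm = confusion_matrix(y_true, y_pred, n_classes)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n_classes))
    true_counts = [sum(cm[k]) for k in range(n_classes)]
    pred_counts = [sum(cm[r][k] for r in range(n_classes)) for k in range(n_classes)]

    sums = {"precision": 0.0, "recall": 0.0, "dice": 0.0, "jaccard": 0.0}
    for k in range(n_classes):
        tp = cm[k][k]
        fp = pred_counts[k] - tp
        fn = true_counts[k] - tp
        weight = true_counts[k] / total  # class support as the weight
        sums["precision"] += weight * (tp / (tp + fp) if tp + fp else 0.0)
        sums["recall"] += weight * (tp / (tp + fn) if tp + fn else 0.0)
        # Per-class Dice = 2TP / (2TP + FP + FN), identical to per-class F1.
        sums["dice"] += weight * (2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0)
        # Per-class Jaccard = TP / (TP + FP + FN).
        sums["jaccard"] += weight * (tp / (tp + fp + fn) if tp + fp + fn else 0.0)

    # Multiclass MCC computed from the confusion-matrix marginals.
    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    mcc = numerator / denominator if denominator else 0.0

    return {"accuracy": correct / total, "mcc": mcc, **sums}


# Toy labels for three hypothetical lesion classes (made-up data, not from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
metrics = classification_metrics(y_true, y_pred, n_classes=3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With support weighting, recall always equals accuracy, which is consistent with the identical 92.73% figures in the abstract, though that alone does not confirm the authors' averaging choice.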

Authors

  • Ayesha Shaik
    Division of Cardiology, Hartford Hospital, Hartford, Connecticut, USA.
  • Shivanya Shomir Dutta
    School of Computer Science and Engineering, Vellore Institute of Technology (VIT), Chennai, 600127, India.
  • Ishaan Milind Sawant
    School of Computer Science and Engineering, Vellore Institute of Technology (VIT), Chennai, 600127, India.
  • Shreyas Kumar
    School of Computer Science and Engineering, Vellore Institute of Technology (VIT), Chennai, 600127, India.
  • Ananthakrishnan Balasundaram
    Centre for Cyber Physical Systems, Vellore Institute of Technology, Chennai, 600127, India. balasundaram.a@vit.ac.in.
  • Kanjar De
    Video Coding Systems, Fraunhofer Heinrich-Hertz-Institut, Berlin, Germany.