Unleashing the Potential of Residual and Dual-Stream Transformers for the Remote Sensing Image Analysis.

Journal: Journal of Imaging
Published Date:

Abstract

Classifying remote sensing satellite imagery is crucial for applications such as environmental monitoring, urban planning, and disaster management. Among deep learning techniques, Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown exceptional performance in feature extraction and representation learning. This paper presents ResV2ViT, a hybrid dual-stream model that combines the advantages of the ResNet50V2 and Vision Transformer (ViT) architectures. The dual-stream design lets the model extract both local spatial features and global contextual information by processing data through two complementary pathways. The ResNet50V2 component performs hierarchical feature extraction and captures short-range dependencies, whereas the ViT module models long-range dependencies and global context. After position embedding, the tokens are split into two parts, q1 and q2: q1 is passed to a convolutional block that refines local spatial details, and q2 is passed to the Transformer, which applies global attention to the spatial features. Combining the two architectures allows the model to learn both low-level and high-level feature representations, improving classification performance. We evaluate the proposed ResV2ViT model on the RSI-CB256 dataset and on a second dataset with 21 classes. The model attains an average accuracy of 99.91%, with precision and F1 score of 99.90%, on the first dataset and 98.75% accuracy on the second, demonstrating its efficacy in satellite image classification. These results show that the dual-stream hybrid ResV2ViT model outperforms traditional CNN and Transformer-based models, making it a strong framework for remote sensing applications.
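
The token split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions made only for illustration, since the abstract does not specify them: the tokens are split in half along the embedding dimension, the convolutional stream is a single depthwise 3x3 convolution over the token grid, the Transformer stream is a single-head self-attention layer, the two outputs are fused by concatenation, and all weights are random. The function names (local_conv_stream, global_attention_stream, dual_stream_block) are hypothetical.

```python
import numpy as np


def local_conv_stream(q1, grid, kernel):
    """Depthwise 3x3 convolution over the token grid (local refinement).

    q1: (N, C) tokens, grid: (H, W) with H * W == N, kernel: (3, 3).
    """
    h, w = grid
    c = q1.shape[1]
    x = q1.reshape(h, w, c)
    # Zero padding keeps the output grid the same size as the input grid.
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return out.reshape(h * w, c)


def global_attention_stream(q2, wq, wk, wv):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = q2 @ wq, q2 @ wk, q2 @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ v


def dual_stream_block(tokens, grid, rng):
    """Split tokens into q1/q2, process each stream, concatenate the results.

    ASSUMPTION: the split is along the embedding dimension and the fusion
    is a plain concatenation; the abstract does not state either choice.
    """
    d = tokens.shape[1]
    half = d // 2
    q1, q2 = tokens[:, :half], tokens[:, half:]
    kernel = rng.normal(scale=0.1, size=(3, 3))  # random stand-in weights
    wq, wk, wv = (rng.normal(scale=0.1, size=(d - half, d - half))
                  for _ in range(3))
    local = local_conv_stream(q1, grid, kernel)
    global_ctx = global_attention_stream(q2, wq, wk, wv)
    return np.concatenate([local, global_ctx], axis=1)


rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))  # a 4x4 grid of 8-dimensional tokens
fused = dual_stream_block(tokens, (4, 4), rng)
print(fused.shape)
```

In this sketch each output token keeps the input width: half of its channels come from a 3x3 neighbourhood on the grid, and the other half are attention-weighted mixtures over every token, which mirrors the local/global division of labour the abstract describes.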

Authors

  • Priya Mittal
    Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura 140401, Punjab, India.
  • Vishesh Tanwar
    Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura 140401, Punjab, India.
  • Bhisham Sharma
    Chitkara University School of Engineering and Technology, Chitkara University, Himachal Pradesh, India.
  • Dhirendra Prasad Yadav
    Department of Computer Engineering and Applications, GLA University, Mathura 281406, Uttar Pradesh, India.
