An Explainable Vision Transformer with Transfer Learning Combined with Support Vector Machine Based Efficient Drought Stress Identification
Journal: arXiv
Published Date: Jul 31, 2024
Abstract
Early detection of drought stress is critical for taking timely measures to
reduce crop loss before the drought impact becomes irreversible. The subtle
phenotypical and physiological changes in response to drought stress are
captured by non-invasive imaging techniques, and these imaging data serve as a
valuable resource for machine learning methods to identify drought stress.
While convolutional neural networks (CNNs) are in wide use, vision transformers
(ViTs) present a promising alternative in capturing long-range dependencies and
intricate spatial relationships, thereby enhancing the detection of subtle
indicators of drought stress. We propose an explainable deep learning pipeline
that leverages the power of ViTs for drought stress detection in potato crops
using aerial imagery. We applied two distinct approaches: (i) a combination of
ViT and a support vector machine (SVM), in which the ViT extracts intricate
spatial features from aerial images and the SVM classifies the crops as
stressed or healthy; and (ii) an end-to-end approach that uses a dedicated
classification layer within the ViT to detect drought stress directly. We
explain the ViT model's decision-making process by visualizing attention maps,
which highlight the specific spatial features within the aerial images that the
ViT model focuses on as the drought stress signature. Our findings demonstrate
that the proposed methods not only achieve high accuracy in drought stress
identification but also shed light on the diverse subtle plant features
associated with drought stress. This offers a robust and interpretable solution
for drought stress monitoring, enabling farmers to make informed decisions for
improved crop management.
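
The ViT-plus-SVM approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes NumPy only: a toy, untrained single-head attention encoder stands in for the pretrained ViT, synthetic 16x16 images with a brighter block stand in for aerial imagery of stressed crops, and a hand-written hinge-loss linear classifier stands in for the SVM. All function names, sizes, and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(img, p):
    """Split an (H, W) image into flattened, non-overlapping p x p patches."""
    h, w = img.shape
    return img.reshape(h // p, p, w // p, p).swapaxes(1, 2).reshape(-1, p * p)

def encode(img, W_embed, p=4):
    """Toy stand-in for a ViT: returns a pooled feature and an attention grid."""
    tokens = patchify(img, p) @ W_embed                  # (n_patches, d)
    query = tokens.mean(axis=0)                          # assumed stand-in for a [CLS] query
    scores = tokens @ query / np.sqrt(tokens.shape[1])   # scaled dot-product scores
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()                                   # softmax over patches
    feature = attn @ tokens                              # attention-weighted pooling
    return feature, attn.reshape(img.shape[0] // p, img.shape[1] // p)

def make_image(stressed, size=16, block=8):
    """Synthetic image: noise, plus a brighter block when 'stressed' (assumption)."""
    img = rng.normal(0.0, 0.1, size=(size, size))
    if stressed:
        r, c = 4 * rng.integers(0, 3, size=2)
        img[r:r + block, c:c + block] += 1.0
    return img

def make_dataset(n, W_embed):
    """Alternate healthy (-1) and stressed (+1) samples and encode each image."""
    y = np.array([1.0 if i % 2 else -1.0 for i in range(n)])
    X = np.stack([encode(make_image(label > 0), W_embed)[0] for label in y])
    return X, y

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Linear SVM fitted by full-batch subgradient descent on the hinge loss."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1                       # samples violating the margin
        grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / len(y)
        grad_b = -y[viol].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

W_embed = rng.normal(size=(16, 8)) / 4.0                 # random patch embedding (untrained)
X_train, y_train = make_dataset(40, W_embed)
X_test, y_test = make_dataset(40, W_embed)

# Standardize features with training statistics before fitting the classifier.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
w, b = train_linear_svm((X_train - mu) / sigma, y_train)

pred = np.sign(((X_test - mu) / sigma) @ w + b)
acc = float((pred == y_test).mean())

# Attention grid for one stressed image; each cell is the weight of one patch.
_, attn_map = encode(make_image(True), W_embed)
print("held-out accuracy:", acc)
print(np.round(attn_map, 3))
```

In a realistic setting the toy encoder would be replaced by a pretrained ViT backbone and the classifier by a library SVM. The grid returned by `encode` plays the role of the attention maps that the abstract describes visualizing.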