High precision banana variety identification using vision transformer based feature extraction and support vector machine.
Journal:
Scientific Reports
PMID:
40133576
Abstract
Bananas, renowned for their delightful flavor, exceptional nutritional value, and digestibility, are among the most widely consumed fruits globally. The advent of advanced image processing, computer vision, and deep learning (DL) techniques has revolutionized agricultural diagnostics, offering innovative and automated solutions for detecting and classifying fruit varieties. Despite significant progress in DL, the accurate classification of banana varieties remains challenging, particularly due to the difficulty in identifying subtle features at early developmental stages. To address these challenges, this study presents a novel hybrid framework that integrates the Vision Transformer (ViT) model for global semantic feature representation with the robust classification capabilities of Support Vector Machines. The proposed framework was rigorously evaluated on two datasets: the four-class BananaImageBD and the six-class BananaSet. To mitigate data imbalance issues, a robust evaluation strategy was employed, resulting in a remarkable classification accuracy rate (CAR) of 99.86% ± 0.099 for BananaSet and 99.70% ± 0.17 for BananaImageBD, surpassing traditional methods by a margin of 1.77%. The ViT model, leveraging self-supervised and semi-supervised learning mechanisms, demonstrated exceptional promise in extracting nuanced features critical for agricultural applications. By combining ViT features with cutting-edge machine learning classifiers, the proposed system establishes a new benchmark in precision and reliability for the automated detection and classification of banana varieties. These findings underscore the potential of hybrid DL frameworks in advancing agricultural diagnostics and pave the way for future innovations in the domain.
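The abstract reports each CAR as a mean with a spread (for example, 99.86% ± 0.099) but does not say which evaluation strategy produced it. The sketch below shows one common way such a figure could arise, assuming stratified k-fold splitting to handle class imbalance and a sample standard deviation across folds. The function names, class counts, and per-fold accuracies are hypothetical and are not taken from the paper.

```python
import random
import statistics
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold
        # receives roughly the same share of that class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def summarize(accuracies):
    """Return (mean, sample standard deviation) of per-fold accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Hypothetical imbalanced four-class label set (counts are made up).
labels = ["A"] * 40 + ["B"] * 20 + ["C"] * 10 + ["D"] * 10
folds = stratified_folds(labels, k=5)

# Hypothetical per-fold accuracies in percent, NOT results from the paper.
fold_acc = [99.5, 99.0, 100.0, 99.5, 99.5]
mean, std = summarize(fold_acc)
print(f"CAR = {mean:.2f}% ± {std:.3f}")
```

In a full pipeline of the kind the abstract describes, each fold's accuracy would come from training an SVM on ViT features of the training folds and scoring it on the held-out fold; that step is omitted here because the abstract gives no model or hyperparameter details.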