CATransU-Net: Cross-attention TransU-Net for field rice pest detection.

Journal: PloS one
Published Date:

Abstract

Accurate detection of rice pests in the field is a key problem in pest control. U-Net extracts local image features effectively, while the Transformer is well suited to modeling long-range dependencies. A Cross-Attention TransU-Net (CATransU-Net) model is constructed for paddy pest detection by combining the two. It consists of an encoder, a decoder, a dual Transformer-attention (DTA) module, and a cross-attention skip-connection (CASC). Dilated residual Inception (DRI) blocks in the encoder extract multiscale features, DTA is added at the bottleneck of the model to efficiently learn nonlocal interactions between encoder features, and CASC replaces the plain skip connections between encoder and decoder to model multi-resolution feature representations. Compared with U-Net and the Transformer alone, CATransU-Net extracts multiscale features through DRI and DTA, and enhances the feature representation through CASC and the decoder to generate high-resolution insect images. Experimental results on the large-scale multiclass IP102 and AgriPest benchmark datasets show that CATransU-Net is effective for rice pest extraction, reaching a precision of 93.51%, about 2% higher than other methods and 9.36% higher than U-Net. The proposed method can be applied to field rice pest detection systems. Code is available at https://github.com/chenchenchen23123121da/CATransU-Net.
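
The abstract does not say how CASC is computed, so the following is only a minimal sketch of the general idea of a cross-attention skip connection, not the authors' implementation. It assumes generic scaled dot-product attention in which decoder tokens act as queries and encoder tokens act as keys and values, with no learned projections and with the attended features concatenated onto the decoder features. All function names (`softmax`, `cross_attention`, `cross_attention_skip`) are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (assumed form).

    queries: list of n_q vectors of length d (e.g. decoder tokens)
    keys:    list of n_k vectors of length d (e.g. encoder tokens)
    values:  list of n_k vectors of length d_v
    Returns n_q vectors of length d_v.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    value_dim = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this query to every encoder token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the encoder values.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(value_dim)]
        )
    return attended


def cross_attention_skip(decoder_tokens, encoder_tokens):
    """Hypothetical skip connection: decoder tokens query the encoder
    tokens, and the attended encoder features are concatenated onto
    each decoder token (instead of copying encoder features directly)."""
    attended = cross_attention(decoder_tokens, encoder_tokens, encoder_tokens)
    return [dec + att for dec, att in zip(decoder_tokens, attended)]


if __name__ == "__main__":
    decoder_tokens = [[1.0, 0.0], [0.0, 1.0]]
    encoder_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for token in cross_attention_skip(decoder_tokens, encoder_tokens):
        print(token)
```

In a plain U-Net skip connection the encoder features would be copied across unchanged; in this sketch each decoder position instead receives a weighted mixture of encoder features, with the weights determined by query-key similarity. A real model would add learned query, key, and value projections and operate on image feature maps rather than short Python lists.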

Authors

  • Xuwei Lu
    Henan Agricultural Information Data Intelligent Engineering Research Center, SIAS University, Zhengzhou, China.
  • Yunlong Zhang
    Xi'an International University, Xi'an 710077, Shaanxi, China.
  • Congqi Zhang
    School of Software Engineering, Chengdu University of Technology, Chengdu, China.