Chinese crop diseases and pests named entity recognition based on variational information bottleneck and feature enhancement.
Journal:
Scientific Reports
Published Date:
Aug 27, 2025
Abstract
Chinese crop diseases and pests named entity recognition (CCDP-NER) is a critical step in extracting domain-specific information about crop diseases and pests, and it plays a significant role in promoting agricultural informatization. To address noisy data, erroneous annotations, and ambiguous entity boundaries in this domain, this study proposes a deep learning-based CCDP-NER model. The model employs a bidirectional gated recurrent unit (BiGRU) to capture long-range semantic dependencies and integrates multi-level dilated convolutional neural networks (DCNNs) to extract fine-grained local features, thereby constructing a global-local collaborative representation. A variational information bottleneck (VIB) is introduced to filter noise by constraining mutual information: it reduces the impact of input noise on feature extraction while strengthening the correlation between the extracted features and the labels, which improves model robustness. In addition, an entity boundary detection module identifies the head and tail positions of entities, improving boundary recognition accuracy. Experiments on a constructed crop diseases and pests dataset show that the proposed model effectively identifies crop disease and pest entities, achieving an F1-score of 90.64%. This research is valuable for applications such as agricultural knowledge graph construction and agricultural question-answering systems.
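The abstract does not state the VIB objective, so the following is only a minimal sketch of the commonly used deep VIB formulation, not the authors' implementation. It assumes a diagonal Gaussian encoder with a standard normal prior, in which case the compression term is a closed-form KL divergence added to the task loss. The function names, the weight `beta`, and its default value are hypothetical and chosen for illustration.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    Assumption: diagonal Gaussian encoder and standard normal prior.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def vib_loss(task_loss, mu, log_var, beta=1e-3):
    """Task loss plus a beta-weighted compression penalty.

    beta is a hypothetical trade-off weight: larger values discard more
    input information, smaller values favor fitting the labels.
    """
    return task_loss + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -0.2], [0.0, -1.0]
    z = sample_latent(mu, log_var, rng)  # noisy latent fed to the tagger
    print(len(z), vib_loss(1.0, mu, log_var, beta=0.1))
```

In a tagger of the kind the abstract describes, `mu` and `log_var` would be predicted per token from the combined BiGRU and DCNN features, and `task_loss` would be the sequence labeling loss. That wiring is an assumption here, since the abstract gives no architectural detail beyond the component names.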