A text-speech multimodal Chinese named entity recognition model for crop diseases and pests.

Journal: Scientific Reports
PMID:

Abstract

Named entity recognition for crop diseases and pests (NER-CDP) is an important part of agricultural information extraction and provides vital data support for downstream knowledge services and retrieval. However, existing NER-CDP methods rely heavily on plain text or on external features such as radicals and font types, which do little to improve word segmentation. In this paper, we propose a multimodal named entity recognition model (CDP-MCNER) based on cross-modal attention to address the degradation in NER performance caused by potential word segmentation errors. We introduce audio modality information into NER-CDP for the first time and use the pauses in spoken sentences to assist Chinese word segmentation. The CDP-MCNER model adopts cross-modal attention as its main architecture to fully integrate the textual and acoustic modalities. Data augmentation techniques, namely introducing disturbances in the text encoder and frequency-domain enhancement in the acoustic encoder, are then used to increase the diversity of the multimodal inputs. To improve the accuracy of the predicted labels, a Masked CTC (Connectionist Temporal Classification) loss is used to further align the multimodal semantic representations. In the experiments, we compare our model with classical text-only models, lexicon-enhanced models, and multimodal models; it achieves the best precision, recall, and F score, at 91.32%, 93.05%, and 92.18%, respectively. Furthermore, our method achieves its best F scores of 81.05% and 79.23% on the public domain datasets CNERTA and Ai-SHELL, respectively. The experimental results demonstrate the effectiveness and generalization ability of the CDP-MCNER model on the NER-CDP task.
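
As a quick consistency check on the reported metrics, the short Python sketch below recomputes the F score from the stated precision and recall. It assumes the F score is the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_score` and the `beta` parameter are illustrative and not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score for a precision/recall pair.

    Inputs may be fractions or percentages; the result uses the same unit.
    Assumption: beta = 1 (balanced F1), which the abstract does not confirm.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Values reported in the abstract for the NER-CDP experiments (percent).
reported_precision = 91.32
reported_recall = 93.05

f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 92.18%"
```

Under that assumption, the recomputed value agrees with the reported F score of 92.18%.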

Authors

  • Ruilin Liu
    Shanghai Tongji Hospital, Tongji University School of Medicine, Shanghai, China.
  • Xuchao Guo
    School of Information Science and Engineering, Shandong Agricultural University, Taian, Shandong, China.
  • HongMei Zhu
    School of Information Science and Engineering, Shandong Agricultural University, Taian, Shandong, China.
  • Lu Wang
    Department of Laboratory, Akesu Center of Disease Control and Prevention, Akesu, China.