A Unified Model Using Distantly Supervised Data and Cross-Domain Data in NER.

Journal: Computational intelligence and neuroscience

Published Date: May 29, 2022

Abstract

Named entity recognition (NER) systems are often built with supervised methods that require large amounts of hand-annotated data. When hand-annotated data is limited, distantly supervised (DS) data and cross-domain (CD) data are usually used separately to improve performance. Distantly supervised data provides in-domain dictionary information, while cross-domain data provides hand-annotated information from another domain. These two types of information are complementary. However, two problems must be solved before either can be used directly. First, distantly supervised data may contain a lot of noise. Second, directly using cross-domain data may degrade performance because of distribution mismatch. In this paper, we propose a unified model named PARE (PArtial learning and REinforcement learning). The PARE model can use distantly supervised data and cross-domain data simultaneously as external data. It uses partial learning with a new label strategy to better handle the noise in distantly supervised data, and reinforcement learning to alleviate the distribution mismatch in cross-domain data. Experiments on three datasets show that our model outperforms the baseline models. In addition, our model can be used when no hand-annotated in-domain data is available.
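As a rough illustration of why distantly supervised data is incomplete and noisy, the sketch below tags a sentence by dictionary lookup. It is not the PARE label strategy, which the abstract does not specify: the function name `distant_label`, the `UNKNOWN` marker, and the toy dictionary are all hypothetical. Tokens that match no dictionary entry are left as unknown rather than forced to "O", which is the kind of partially labeled sequence a partial learning method is meant to train on.

```python
from typing import Dict, List

UNKNOWN = "?"  # hypothetical marker: the token's true label is not known


def distant_label(tokens: List[str], dictionary: Dict[str, str]) -> List[str]:
    """Tag tokens by longest-match dictionary lookup (BIO scheme).

    Dictionary keys are lowercase phrases mapped to entity types.
    Unmatched tokens get UNKNOWN instead of "O", because a missing
    dictionary entry does not prove the token is outside an entity.
    """
    labels = [UNKNOWN] * len(tokens)
    max_len = max((len(key.split()) for key in dictionary), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in dictionary:
                entity_type = dictionary[phrase]
                labels[i] = "B-" + entity_type
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + entity_type
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Toy dictionary (an assumption for illustration); "Acme" is missing from it.
dictionary = {"new york": "LOC", "ibm": "ORG"}
tokens = "IBM opened an office in New York with Acme".split()
labels = distant_label(tokens, dictionary)
for token, label in zip(tokens, labels):
    print(token, label)
```

Here "Acme" stays unknown even though it is plausibly an organization. Labeling it "O" would inject exactly the kind of false negative that makes dictionary-derived annotations noisy.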

Authors

Yun Hu

Department of Radiology, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, 26 Shengli Avenue, Jiangan, Wuhan, 430014, Hubei, China.

Hao He

School of Aerospace Engineering, Xiamen University, Xiamen 361005, P. R. China.

Zhengfei Chen

Shenzhen Power Supply Bureau Co., Ltd., Shenzhen 518001, China.

Qingmeng Zhu

Institute of Software, Chinese Academy of Sciences, Haidian, Beijing 100190, China.

Changwen Zheng

Institute of Software, Chinese Academy of Sciences, Haidian, Beijing 100190, China.

Keywords

Learning; Machine Learning; Recognition, Psychology

External Resources

PubMed ID: 35676955
