Introducing high correlation and high quality instances for few-shot entity linking.

Journal: Neural networks : the official journal of the International Neural Network Society
Published Date:

Abstract

Entity linking, the process of connecting textual mentions in documents to canonical entities in a knowledge base, plays an integral role in many natural language processing tasks. A significant challenge in the field is the scarcity of resources, particularly in specialized domains, which underscores the importance of few-shot entity linking in real-world scenarios. Previous works address the lack of in-domain labeled data by generating synthetic data. However, we argue that synthetic data is frequently far from high quality, and that such low-quality instances introduce noise and diminish the ability of entity linking models to capture the semantic consistency between mentions and entities. In this paper, we propose HFEL, a framework that introduces high-correlation, high-quality instances for few-shot entity linking. We argue that general domains offer abundant high-quality labeled data, some of which is highly correlated with the target domain. Thus, we first design an adversarial instance extraction module to extract such high-correlation instances without depending on additional manually annotated data. To further mitigate the negative effects of low-correlation instances, we train our entity linking model with a variant of curriculum learning. Experimental results on the few-shot entity linking dataset demonstrate the effectiveness of the proposed HFEL framework, which achieves state-of-the-art performance.

Authors

  • Xuhui Sui
    College of Computer Science, VCIP, TMCC, TBI Center, Nankai University, Tianjin 300350, China. Electronic address: suixuhui@dbis.nankai.edu.cn.
  • Ying Zhang
    Department of Nephrology, Nanchong Central Hospital Affiliated to North Sichuan Medical College, Nanchong, China.
  • Kehui Song
    School of Software, Tiangong University, Tianjin 300387, China. Electronic address: songkehui@dbis.nankai.edu.cn.
  • Baohang Zhou
    College of Computer Science, VCIP, TMCC, TBI Center, Nankai University, Tianjin 300350, China. Electronic address: zhoubaohang@dbis.nankai.edu.cn.
  • Xiaojie Yuan
    College of Computer Science, VCIP, TMCC, TBI Center, Nankai University, Tianjin 300350, China. Electronic address: yuanxj@nankai.edu.cn.