Distant Supervision Relation Extraction via adaptive dependency-path and additional knowledge graph supervision.

Journal: Neural networks : the official journal of the International Neural Network Society
Published Date:

Abstract

Relation extraction systems train an extractor by aligning relation instances in a knowledge base with a large labeled corpus. Because labeled datasets are very expensive, Distant Supervision Relation Extraction (DSRE) uses a corpus roughly annotated with a knowledge graph (KG) to reduce the cost of acquisition. Nevertheless, noisy data limits the performance of DSRE. Dependency trees can be used to filter wrongly labeled instances in a distant supervision bag. However, existing dependency-tree relation extraction strategies all rely on manually set paths between the subject and object entities, and they prune the trees either too aggressively or not enough. To overcome these shortcomings, we propose ADSRE, a novel DSRE framework based on an Adaptive dependency path and Additional KG supervision. To obtain the dependency paths related to entity relations adaptively, we introduce an advanced graph neural network, GeniePath, into DSRE. Through breadth exploration, it assigns higher weights to the direct neighbor words that contribute more to relation prediction; through depth exploration, it determines the correlation between relations and high-order neighbors. In this way, irrelevant nodes are pruned while relevant nodes are kept, so our method obtains more appropriate paths associated with relations. At the same time, to further reduce noise in the data, we incorporate additional supervision from the knowledge graph by narrowing the margin between the bag representation and the pre-trained knowledge graph embedding. Extensive numerical experiments validate the effectiveness of our method.
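
The breadth and depth exploration described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a GeniePath-style layer in which breadth exploration is softmax attention over a word's direct neighbors in the dependency tree and depth exploration is an LSTM-style gate applied hop by hop. The toy sentence, the random weights, the sharing of weights across layers, and all function names (`breadth`, `depth`, `run`) are hypothetical choices made only for illustration.

```python
import math
import random

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def breadth(h, neighbors, i, Ws, Wd, v, W):
    """Breadth exploration: attention over node i and its direct neighbors."""
    nbrs = [i] + neighbors[i]
    src = matvec(Ws, h[i])
    scores = []
    for j in nbrs:
        dst = matvec(Wd, h[j])
        scores.append(sum(vk * math.tanh(a + b) for vk, a, b in zip(v, src, dst)))
    alpha = softmax(scores)  # one weight per neighbor, summing to 1
    mixed = [sum(a * h[j][k] for a, j in zip(alpha, nbrs)) for k in range(len(h[i]))]
    return [math.tanh(z) for z in matvec(W, mixed)], dict(zip(nbrs, alpha))

def depth(h_tmp, cell, Wi, Wf, Wo, Wc):
    """Depth exploration: LSTM-style gates decide how much of this hop to keep."""
    i_gate = [sigmoid(z) for z in matvec(Wi, h_tmp)]
    f_gate = [sigmoid(z) for z in matvec(Wf, h_tmp)]
    o_gate = [sigmoid(z) for z in matvec(Wo, h_tmp)]
    cand = [math.tanh(z) for z in matvec(Wc, h_tmp)]
    new_cell = [f * c + i * g for f, c, i, g in zip(f_gate, cell, i_gate, cand)]
    new_h = [o * math.tanh(c) for o, c in zip(o_gate, new_cell)]
    return new_h, new_cell

def run(neighbors, dim=4, layers=2, seed=0):
    """Stack `layers` hops; weights are random and shared across hops (a simplification)."""
    rng = random.Random(seed)
    n = len(neighbors)
    h = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]  # stand-in word embeddings
    cell = [[0.0] * dim for _ in range(n)]
    Ws, Wd, W, Wi, Wf, Wo, Wc = (rand_matrix(rng, dim, dim) for _ in range(7))
    v = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    attn = []
    for _ in range(layers):
        new_h, new_cell, layer_attn = [], [], []
        for i in range(n):
            h_tmp, alpha = breadth(h, neighbors, i, Ws, Wd, v, W)
            hi, ci = depth(h_tmp, cell[i], Wi, Wf, Wo, Wc)
            new_h.append(hi)
            new_cell.append(ci)
            layer_attn.append(alpha)
        h, cell = new_h, new_cell
        attn.append(layer_attn)
    return h, attn

# Hypothetical dependency tree for "Obama was born in Hawaii" (undirected edges).
TOKENS = ["Obama", "was", "born", "in", "Hawaii"]
NEIGHBORS = [[2], [2], [0, 1, 4], [4], [2, 3]]

h, attn = run(NEIGHBORS)
for j, weight in attn[0][2].items():
    print(f"born -> {TOKENS[j]}: {weight:.3f}")
```

With trained rather than random weights, neighbors that receive low attention contribute little to a node's representation, which is one way to read the adaptive pruning of irrelevant nodes; stacking hops lets information from high-order neighbors reach the entity words.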

Authors

  • Yong Shi
    Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China; College of Information Science and Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA. Electronic address: yshi@ucas.ac.cn.
  • Yang Xiao
  • Pei Quan
    School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101408, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, 100190, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing, 100190, China. Electronic address: quanpei17@mails.ucas.ac.cn.
  • MingLong Lei
    Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China. Electronic address: leiml@bjut.edu.cn.
  • Lingfeng Niu
    School of Economics and Management, University of Chinese Academy of Sciences, Beijing, 100190, China. Electronic address: niulf@ucas.ac.cn.