Open-world semi-supervised relation extraction.

Journal: Neural networks : the official journal of the International Neural Network Society

PMID: 39987714

Abstract

Semi-supervised relation extraction methods play an important role in extracting relationships from unstructured text because they can leverage both labeled and unlabeled data to improve extraction accuracy. However, these methods rest on the closed-world assumption, in which the relationship types of labeled and unlabeled data belong to the same closed set, and are therefore not applicable to real-world scenarios that involve novel relationships. To address this issue, this paper proposes an open-world semi-supervised relation extraction task and a novel method, Seen relation Identification and Novel relation Discovery (SIND), to extract both seen and novel relations simultaneously. Specifically, SIND develops a contrastive learning strategy to improve the semantic representation of relations and incorporates a cluster-aware method that discovers novel relations by leveraging the pairwise similarity between samples in the feature space. Additionally, SIND adopts a prior distribution based on maximum entropy theory to address the learning-pace imbalance caused by the absence of labeled data for novel classes. Experimental results on three widely used benchmark datasets demonstrate that SIND achieves significant improvements over baseline models. This study explores the challenge of discovering relationships within unannotated data and offers a reference approach for other natural language processing tasks in open-world scenarios, such as text classification and named entity recognition. The datasets and source code of this work are available at https://github.com/a-home-bird/SIND.
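
The two ideas named in the abstract, pairwise similarity in the feature space and a maximum-entropy prior, can be illustrated with a minimal NumPy sketch. This is not the released SIND code, and the abstract does not specify the loss forms. The function names (`nearest_neighbor_pairs`, `pairwise_loss`, `entropy_regularizer`) are hypothetical, and the formulas are a common choice in open-world semi-supervised learning, assumed here for illustration only.

```python
import numpy as np

def nearest_neighbor_pairs(features):
    """Return, for each sample, the index of its most cosine-similar other sample."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a sample may not pair with itself
    return sim.argmax(axis=1)

def pairwise_loss(probs, neighbors):
    """Assumed pairwise objective: encourage each sample and its nearest
    neighbor to receive the same class prediction."""
    agreement = np.sum(probs * probs[neighbors], axis=1)
    return float(-np.mean(np.log(agreement + 1e-12)))

def entropy_regularizer(probs):
    """Assumed maximum-entropy term: negative entropy of the mean prediction.
    It is smallest when predictions are spread uniformly over all classes,
    which discourages collapsing every sample onto the seen classes."""
    mean = probs.mean(axis=0)
    return float(np.sum(mean * np.log(mean + 1e-12)))

# Toy relation embeddings: two tight groups in a 2-D feature space.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
neighbors = nearest_neighbor_pairs(features)

# Toy class probabilities over 4 classes (rows sum to 1).
uniform_probs = np.full((4, 4), 0.25)
collapsed_probs = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (4, 1))
```

In this toy setup, `entropy_regularizer(uniform_probs)` is lower than `entropy_regularizer(collapsed_probs)`, so minimizing the term pushes the model away from assigning every unlabeled sample to a single class.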

Authors

Diange Zhou

Arthritis Clinical and Research Center, Peking University People's Hospital, Peking University, Beijing, China.

Yilin Duan

School of Computer Science, China University of Geosciences, Wuhan, 430078, China. Electronic address: duanyl@cug.edu.cn.

Shengwen Li

School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China. swli@cug.edu.cn.

Hong Yao

School of Computer Science, China University of Geosciences, Wuhan 430074, China. yaohong@cug.edu.cn.

Keywords

Data Mining; Humans; Natural Language Processing; Neural Networks, Computer; Semantics; Supervised Machine Learning
