Biomedical document-level relation extraction with thematic capture and localized entity pooling.

Journal: Journal of Biomedical Informatics
Published Date:

Abstract

Compared with sentence-level relation extraction, document-level relation extraction is more challenging because a document typically contains multiple entities, and one entity may be associated with multiple other entities. Existing methods often rely on graph structures to capture path representations between entity pairs. In contrast, this paper introduces a novel approach called local entity pooling, which relies solely on the pre-trained model to identify the bridge entity related to the current entity pair and generate the reasoning path representation. This technique effectively mitigates the multi-entity problem. Additionally, the model leverages the multi-entity and multi-label characteristics of the document to acquire the document's thematic representation, thereby enhancing the document-level relation extraction task. Experimental evaluations were conducted on two biomedical datasets, CDR and GDA. Our TCLEP (Thematic Capture and Localized Entity Pooling) model achieved Macro-F1 scores of 71.7% and 85.3%, respectively. We also incorporated the local entity pooling and thematic capture modules into the state-of-the-art model, resulting in performance improvements of 1.5% and 0.2% on the respective datasets. These results highlight the advanced performance of our proposed approach.

Authors

  • Yuqing Li
    Deep Space Exploration Research Center, Harbin Institute of Technology, Harbin, China.
  • Xinhui Shao
    Department of Mathematics, College of Sciences, Northeastern University, Shenyang, China. Electronic address: xinhui1002@126.com.