Tell me your position: Distantly supervised biomedical entity relation extraction using entity position marker.

Journal: Neural networks : the official journal of the International Neural Network Society
Published Date:

Abstract

The advancement of biomedical technologies has recently produced a significant amount of textual data in the biomedical area. Large-scale biomedical data can be obtained automatically with the help of distant supervision. However, the noisy data introduced by distant supervision methods makes relation extraction more difficult. Previous work has focused mainly on restoring mislabeled relations and has paid little attention to the importance of labeled entity positions for relation extraction. In this paper, we present a "four-stage" model based on BioBERT and Multi-Instance Learning that uses entity position markers. First, each sentence is marked with entity positions. Second, BioBERT, a biomedical pre-trained language model, is used to build the final sentence feature vector, which draws not only on the global position marker but also on the start and end markers of both the head and tail entities. Third, the sentence vectors in a bag are aggregated into the bag's feature vector by three aggregation methods, and the performance of different sentence feature vectors combined with different bag encoding methods is discussed. Finally, relation classification is performed at the bag level. According to the experimental results, the presented model significantly outperforms all baseline models and contributes to noise reduction. In addition, each bag encoding method needs to be matched with a corresponding sentence encoding representation to achieve the best performance.
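
The marking and bag aggregation stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The marker strings (`[E1]`, `[/E1]`, `[E2]`, `[/E2]`), the function names `mark_entities` and `aggregate_bag`, and the choice of mean, max, and attention pooling are all assumptions made for illustration: the abstract does not name the markers or the three aggregation methods. The BioBERT encoding step is omitted, and plain lists of floats stand in for the sentence vectors.

```python
import math

# Hypothetical marker tokens; the abstract does not specify the actual strings.
HEAD_START, HEAD_END = "[E1]", "[/E1]"
TAIL_START, TAIL_END = "[E2]", "[/E2]"


def mark_entities(tokens, head_span, tail_span):
    """Wrap the head and tail entity in start/end markers.

    Spans are (start, end) token indices with end exclusive. They are
    assumed to be non-empty and non-overlapping.
    """
    opens = {head_span[0]: HEAD_START, tail_span[0]: TAIL_START}
    closes = {head_span[1]: HEAD_END, tail_span[1]: TAIL_END}
    out = []
    for i in range(len(tokens) + 1):
        if i in closes:          # close a finished entity first
            out.append(closes[i])
        if i in opens:           # then open an entity starting here
            out.append(opens[i])
        if i < len(tokens):
            out.append(tokens[i])
    return out


def aggregate_bag(sentence_vectors, method="mean", query=None):
    """Combine the sentence vectors of one bag into a single bag vector.

    The three methods here are common multi-instance learning choices,
    assumed for illustration only.
    """
    dim = len(sentence_vectors[0])
    n = len(sentence_vectors)
    if method == "mean":
        return [sum(v[d] for v in sentence_vectors) / n for d in range(dim)]
    if method == "max":
        return [max(v[d] for v in sentence_vectors) for d in range(dim)]
    if method == "attention":
        # Dot-product scores against a (hypothetical) relation query vector,
        # normalised with a numerically stable softmax.
        scores = [sum(a * b for a, b in zip(v, query)) for v in sentence_vectors]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        return [sum(w * v[d] for w, v in zip(weights, sentence_vectors))
                for d in range(dim)]
    raise ValueError(f"unknown aggregation method: {method}")


marked = mark_entities(["aspirin", "inhibits", "COX-1", "activity"], (0, 1), (2, 3))
bag = [[1.0, 4.0], [3.0, 2.0]]  # toy stand-ins for BioBERT sentence vectors
bag_mean = aggregate_bag(bag, "mean")
bag_max = aggregate_bag(bag, "max")
```

In a full pipeline, the marked token sequence would be passed to the encoder, and the resulting bag vector would be passed to a relation classifier. Both steps are outside the scope of this sketch.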

Authors

  • Jiran Zhu
    School of Information Science and Engineering, Shandong Normal University, Jinan, China.
  • Jikun Dong
    School of Information Science and Engineering, Shandong Normal University, Jinan, China.
  • Hongyun Du
    School of Information Science and Engineering, Shandong Normal University, Jinan, China.
  • Yanfang Geng
    School of Information Science and Engineering, Shandong Normal University, Jinan, China.
  • Shengyu Fan
    School of Information Science and Engineering, Shandong Normal University, Jinan, China.
  • Hui Yu
    Engineering Technology Research Center of Shanxi Province for Opto-Electric Information and Instrument, Taiyuan 030051, China. 13934603474@nuc.edu.cn.
  • Zengzhen Shao
    School of Data and Computer Science, Shandong Women's University, Jinan, China.
  • Xia Wang
    Department of Neurology, The Sixth People's Hospital of Huizhou City, Huizhou, China.
  • Yaping Yang
    AiLife Diagnostics, Pearland, TX, USA.
  • Weizhi Xu
    School of Information Science and Engineering, Shandong Normal University, Jinan, China. Electronic address: xuweizhi@sdnu.edu.cn.