From electronic health records to terminology base: A novel knowledge base enrichment approach.

Journal: Journal of biomedical informatics
Published Date:

Abstract

Enriching a terminology base (TB) is an important and continuous process, since formal terms can be renamed and new term aliases emerge all the time. As a potential supplementary source for TB enrichment, the electronic health record (EHR) is a fundamental resource for clinical research and practice. The task of aligning the set of external terms in EHRs to the TB can be regarded as entity alignment without structural information. Conventional approaches mainly use the internal structural information of multiple knowledge bases (KBs) to map entities to their counterparts across KBs. However, the external terms in EHRs are independent clinical terms that lack interrelations. To achieve entity alignment in this setting, we proposed a novel automatic TB enrichment approach, named semantic & structure embeddings-based relevancy prediction (S2ERP). To obtain the semantic embeddings of external terms, we fed them together with formal entities into a pre-trained language model. Meanwhile, a graph convolutional network was used to obtain the structure embeddings of the synonyms and hyponyms in the TB. Afterwards, S2ERP combines both embeddings to measure relevancy. Experimental results on a clinical indicator TB, collected from 38 top-class hospitals of the Shanghai Hospital Development Center, showed that the proposed approach outperforms baseline methods by 14.16% in Hits@1.
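The Hits@1 figure reported above is a ranking metric: the fraction of external terms for which the correct TB entity is ranked first among the candidates. The following is a minimal sketch of the general Hits@k computation, not the authors' evaluation code. The function name `hits_at_k`, the dictionary layout, and the example terms are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def hits_at_k(rankings: Dict[str, List[str]], gold: Dict[str, str], k: int = 1) -> float:
    """Return the fraction of external terms whose gold TB entity
    appears among the top-k ranked candidates.

    rankings: external term -> candidate TB entities, best first (hypothetical layout)
    gold:     external term -> correct TB entity
    """
    if not gold:
        raise ValueError("gold must contain at least one term")
    hits = 0
    for term, target in gold.items():
        # A term with no ranking counts as a miss.
        top_k = rankings.get(term, [])[:k]
        if target in top_k:
            hits += 1
    return hits / len(gold)


# Hypothetical example data; not taken from the paper's dataset.
rankings = {
    "WBC": ["white blood cell count", "red blood cell count"],
    "Hb": ["hematocrit", "hemoglobin"],
    "PLT": ["platelet count", "hemoglobin"],
    "ALT": ["aspartate aminotransferase", "alanine aminotransferase"],
}
gold = {
    "WBC": "white blood cell count",
    "Hb": "hemoglobin",
    "PLT": "platelet count",
    "ALT": "alanine aminotransferase",
}

print(hits_at_k(rankings, gold, k=1))
print(hits_at_k(rankings, gold, k=2))
```

In this toy example, two of the four terms have the correct entity in first place, so Hits@1 is 0.5, while every correct entity appears within the top two, so Hits@2 is 1.0.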

Authors

  • Jiaying Zhang
    School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China.
  • Zhixing Zhang
    School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China.
  • Huanhuan Zhang
    School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China.
  • Zhiyuan Ma
    Easton Cardiovascular Associates, Easton, PA, United States of America.
  • Qi Ye
    Department of Pharmaceutical Sciences, Northeastern University, Boston, MA 02115, USA.
  • Ping He
    Shanghai Hospital Development Center, Shanghai 200040, China. Electronic address: heping@shdc.org.cn.
  • Yangming Zhou
    School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China. Electronic address: ymzhou@ecust.edu.cn.