[Automatic labeling and extraction of terms in natural language processing in acupuncture clinical literature].

Journal: Zhongguo zhen jiu = Chinese acupuncture & moxibustion
Published Date:

Abstract

This paper analyzes what makes term recognition in acupuncture clinical literature distinctive and compares the advantages and disadvantages of three named entity recognition (NER) methods used in the field of traditional Chinese medicine. The authors argue that the bidirectional long short-term memory network with a conditional random field layer (BiLSTM-CRF) can capture contextual information and perform NER with fewer feature rules, which makes the model suitable for term recognition in acupuncture clinical literature. Based on this model, they propose that term recognition in acupuncture clinical literature should comprise four steps: literature preprocessing, sequence labeling, model training, and performance evaluation. This process offers an approach to structuring the terminology of acupuncture clinical literature.
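
To make the sequence labeling and evaluation steps concrete, the following is a minimal sketch in Python. It assumes the common BIO tagging scheme and entity-level precision, recall, and F1; the abstract does not state which scheme or metrics the authors used. The example sentence, the label types `ACUPOINT` and `DISEASE`, and all function names are hypothetical and chosen only for illustration. The model training step (BiLSTM-CRF) is not shown; the predicted tags are written by hand.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_label(tokens: List[str], spans: List[Span]) -> List[str]:
    """Convert annotated token spans into one BIO tag per token."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining tokens of the entity
    return tags


def extract_entities(tags: List[str]) -> Set[Span]:
    """Recover entity spans from a BIO tag sequence."""
    entities: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def precision_recall_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Entity-level scores: a prediction counts only if span and type both match."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hypothetical sentence and annotations (not taken from the paper).
tokens = ["Needling", "Zusanli", "(", "ST36", ")", "relieved", "functional", "dyspepsia"]
gold_spans = [(1, 2, "ACUPOINT"), (3, 4, "ACUPOINT"), (6, 8, "DISEASE")]
gold_tags = bio_label(tokens, gold_spans)

# Hand-written stand-in for model output; it misses the second acupoint mention.
pred_tags = ["O", "B-ACUPOINT", "O", "O", "O", "O", "B-DISEASE", "I-DISEASE"]

precision, recall, f1 = precision_recall_f1(
    extract_entities(gold_tags), extract_entities(pred_tags)
)
print(gold_tags)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Scoring at the entity level rather than per token means a partially matched term earns no credit, which is usually the stricter and more informative choice when the goal is to extract whole terms.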

Authors

  • Hua-Yun Liu
    Graduate School of Tianjin University of TCM, Tianjin 301617, China.
  • Chen-Jing Han
    Graduate School of Tianjin University of TCM, Tianjin 301617, China.
  • Jie Xiong
    Department of Laboratory Medicine, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan 646000, China; Department of Laboratory Medicine, General Hospital of Chengdu Military Region, Chengdu, Sichuan 610083, China.
  • Hai-Yan Li
    Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences.
  • Lei Lei
    Division of Cardiology, Department of Internal Medicine, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Bao-Yan Liu
    China Academy of Chinese Medical Sciences, Beijing 100700.