Improving clinical named entity recognition in Chinese using the graphical and phonetic feature.

Journal: BMC medical informatics and decision making
Published Date:

Abstract

BACKGROUND: Clinical named entity recognition aims to identify the names of diseases, body parts, and other related terms in a given text. Because Chinese differs substantially from English, a machine cannot simply obtain graphical and phonetic information from Chinese characters, so the method for Chinese should differ from that for English. Chinese characters carry abundant information in their graphical features, and recent research on Chinese word embeddings has tried to use this graphical information as subword units. This paper uses both graphical and phonetic features to improve Chinese clinical named entity recognition, based on the presence of phono-semantic characters.

Authors

  • Yifei Wang
    Department of Dermatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Sophia Ananiadou
  • Jun'ichi Tsujii