CMCN: Chinese medical concept normalization using continual learning and knowledge-enhanced.

Journal: Artificial intelligence in medicine
Published Date:

Abstract

Medical Concept Normalization (MCN) is a crucial step in deep information extraction and natural language processing tasks, and it plays a vital role in biomedical research. Although MCN in English has seen significant progress, Chinese medical concept normalization (CMCN) remains insufficiently explored because of the complex syntactic structure of Chinese and the paucity of Chinese medical semantic and ontology resources. In recent years, deep learning has been applied extensively across natural language processing tasks, owing to its robust learning capabilities, adaptability, and transferability, and it has proven well suited to intricate and specialized knowledge discovery research in the biomedical field. In this study, we investigate CMCN through the lens of deep learning. Specifically, we introduce a model that leverages polymorphic semantic information and knowledge enhancement through multi-task learning, and that retains the more important medical features through continual learning. As the cornerstone of CMCN, disease names are the main focus of this research. We evaluated various methods on a Chinese disease dataset that we built ourselves; our best-performing model, GCBM-BSCL, achieved 76.12% Accuracy@1, 87.20% Accuracy@5, and 90.02% Accuracy@10. This research not only advances the fields of knowledge mining and medical concept normalization but also supports the integration and application of artificial intelligence in the medical and health field. The source code and results are available at https://github.com/BearLiX/CMCN.
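
The abstract reports Accuracy@1, Accuracy@5, and Accuracy@10 without defining them. A minimal sketch of the usual definition follows, assuming that the model returns a ranked list of candidate standard terms for each mention and that a mention counts as correct when its gold term appears among the top k candidates. The function name `accuracy_at_k` and the toy data are hypothetical illustrations; they are not taken from the paper or its repository.

```python
from typing import Sequence


def accuracy_at_k(
    ranked_candidates: Sequence[Sequence[str]],
    gold_terms: Sequence[str],
    k: int,
) -> float:
    """Fraction of mentions whose gold term is among the top-k ranked candidates."""
    if len(ranked_candidates) != len(gold_terms):
        raise ValueError("need exactly one ranked candidate list per gold term")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not gold_terms:
        return 0.0
    hits = sum(
        1
        for candidates, gold in zip(ranked_candidates, gold_terms)
        if gold in candidates[:k]
    )
    return hits / len(gold_terms)


# Toy data (hypothetical): three mentions, each with a ranked candidate list.
rankings = [
    ["type 2 diabetes mellitus", "type 1 diabetes mellitus", "gestational diabetes"],
    ["pulmonary hypertension", "essential hypertension", "hypotension"],
    ["asthma", "bronchitis", "chronic obstructive pulmonary disease"],
]
gold = ["type 2 diabetes mellitus", "essential hypertension", "pneumonia"]

acc_at_1 = accuracy_at_k(rankings, gold, 1)  # only the first mention is a hit
acc_at_2 = accuracy_at_k(rankings, gold, 2)  # the second mention is recovered at rank 2
```

Under this definition Accuracy@k cannot decrease as k grows, which is consistent with the reported ordering of the three scores.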

Authors

  • Pu Han
    CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China.
  • Xiong Li
    School of Software, East China Jiaotong University, Nanchang, 330013, China.
  • Zhanpeng Zhang
    Macromolecular Science and Engineering Center, University of Michigan, Ann Arbor, MI 48109, USA.
  • Yule Zhong
    School of Management, Nanjing University of Posts & Telecommunications, Nanjing 210003, China.
  • Liang Gu
    School of Management, Nanjing University of Posts & Telecommunications, Nanjing 210003, China.
  • Yingying Hua
    School of Management, Nanjing University of Posts & Telecommunications, Nanjing 210003, China.
  • Xiaoyan Li
    Shulan International Medical College, Zhejiang Shuren University, Hangzhou, Zhejiang, China.