Rule-Based Natural Language Processing Pipeline to Detect Medication-Related Named Entities: Insights for Transfer Learning.

Journal: Studies in health technology and informatics
PMID:

Abstract

We document the procedure and performance of a rule-based NLP system that automatically extracts essential named entities related to drug errors from Japanese free-text incident reports. We then used the rule-annotated data to fine-tune a pre-trained BERT model, a transfer learning step, and examined its performance on medication-related incident reports. The rule-based pipeline achieved a macro-F1 score of 0.81 on an internal dataset, and the BERT model fine-tuned with rule-annotated data achieved macro-F1 scores of 0.97 and 0.75 for the named entity recognition and relation extraction tasks, respectively. The model can be applied to similar problems in other medication-related clinical texts.
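For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro-averaged F1 score can be computed from gold and predicted labels. The label names (`DRUG`, `DOSE`, `ROUTE`) and the function `macro_f1` are illustrative assumptions, not taken from the paper, and the abstract does not state whether the authors scored at the token or entity level.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was not the gold label
            fn[g] += 1  # gold label g was missed

    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only
gold = ["DRUG", "DRUG", "DOSE", "ROUTE"]
pred = ["DRUG", "DOSE", "DOSE", "ROUTE"]
score = macro_f1(gold, pred)
```

Because every label contributes equally to the mean, a rare entity type with poor recall lowers the macro score as much as a frequent one would, which is why the metric is commonly preferred for imbalanced entity sets.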

Authors

  • Zoie S Y Wong
    Graduate School of Public Health, St. Luke's International University, Tokyo, 104-0045, Japan. Electronic address: zoiesywong@gmail.com.
  • Neil Waters
    Graduate School of Public Health, St. Luke's International University, OMURA Susumu & Mieko Memorial St. Luke's Center for Clinical Academia, Japan.
  • Nicholas I-Hsien Kuo
    Centre for Big Data Research in Health, University of New South Wales, Sydney, Australia. n.kuo@unsw.edu.au.
  • Jiaxing Liu
    School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, China.