Identifying protected health information by transformers-based deep learning approach in Chinese medical text.

Journal: Health informatics journal
PMID:

Abstract

This paper proposes a deep learning algorithm based on Bidirectional Encoder Representations from Transformers (BERT) for identifying privacy information in Chinese clinical texts and verifies the feasibility of the method for privacy protection in the Chinese clinical context. We collected and double-annotated 33,017 discharge summaries from 151 medical institutions on a municipal regional health information platform, developed a BERT-based Bidirectional Long Short-Term Memory (BiLSTM) and Conditional Random Field (CRF) model, and tested its privacy-identification performance on the dataset. To explore the contribution of different substructures of the neural network, we created five additional baseline models and compared their performance. On the annotated data, the BERT model pre-trained on a medical corpus showed a significant performance improvement over the BiLSTM-CRF model, with a micro-recall of 0.979 and an F1 value of 0.976, which indicates that the model performs well in identifying private information in Chinese clinical texts. The BERT-based BiLSTM-CRF model excels at identifying privacy information in Chinese clinical texts, and applying it is effective for protecting patient privacy and facilitating data sharing.
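
The micro-averaged recall and F1 reported above pool true positives, false positives, and false negatives over all entity types before computing the ratios. The following is a minimal sketch of that calculation, not the authors' evaluation code. It assumes BIO-style tags and exact-match, entity-level scoring, neither of which the abstract specifies; the labels NAME, DATE, and ID and the function names are hypothetical.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        # An I- tag continues the open span only if the entity type matches.
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def micro_prf(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over exact-match entity spans."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
        tp += len(g & p)   # spans predicted with correct boundaries and type
        fp += len(p - g)   # predicted spans absent from the gold annotation
        fn += len(g - p)   # gold spans the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data with hypothetical labels; the second gold DATE entity is missed.
gold = [["B-NAME", "I-NAME", "O", "B-DATE"], ["O", "B-ID", "I-ID"]]
pred = [["B-NAME", "I-NAME", "O", "O"], ["O", "B-ID", "I-ID"]]
precision, recall, f1 = micro_prf(gold, pred)
```

In a de-identification setting, recall is usually the figure to watch, since a missed span (a false negative) leaves private information exposed, whereas a false positive only over-redacts.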

Authors

  • Kun Xu
    Department of Hygienic Inspection, School of Public Health, Jilin University, 1163 Xinmin Street, Changchun 130021, Jilin, China. Electronic addresses: songxiuling@jlu.edu.cn, li_juan@jlu.edu.cn, jinmh@jlu.edu.cn. Tel.: +86 43185619441.
  • Yang Song
    Biomedical and Multimedia Information Technology (BMIT) Research Group, School of IT, University of Sydney, NSW 2006, Australia. Electronic address: yson1723@uni.sydney.edu.au.
  • Jingdong Ma
    School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Hubei, China. Electronic address: jdma@hust.edu.cn.