Unlocking the Secrets Behind Advanced Artificial Intelligence Language Models in Deidentifying Chinese-English Mixed Clinical Text: Development and Validation Study.

Journal: Journal of Medical Internet Research

Published Date: Jan 25, 2024

Abstract

BACKGROUND: The widespread use of electronic health records in the clinical and biomedical fields makes the removal of protected health information (PHI) essential to maintain privacy. However, a significant portion of information is recorded in unstructured textual forms, posing a challenge for deidentification. In multilingual countries, medical records could be written in a mixture of more than one language, referred to as code mixing. Most current clinical natural language processing techniques are designed for monolingual text, and there is a need to address the deidentification of code-mixed text.

Authors

You-Qian Lee
Dialogue System Technical Department, Intelligent Robot, Asustek Computer Inc, Taipei, Taiwan.

Ching-Tai Chen
Institute of Information Science, Academia Sinica, 115, Taipei City, Taiwan.

Chien-Chang Chen
Bio-Microsystems Integration Laboratory, Department of Biomedical Sciences and Engineering, National Central University, Taoyuan City, Taiwan.

Chung-Hong Lee
Knowledge Discovery and Data Mining Lab, Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan.

Peitsz Chen
Department of Chemical Engineering, Feng Chia University, Taichung, Taiwan.

Chi-Shin Wu
Department of Psychiatry, National Taiwan University Hospital, Taipei, Taiwan R.O.C.

Hong-Jie Dai
Department of Computer Science and Information Engineering, National Taitung University, Taiwan. Electronic address: hjdai@nttu.edu.tw.

Keywords

Artificial Intelligence; China; Electronic Health Records; Humans; Natural Language Processing; Privacy

External Resources

PubMed ID: 38271060
