De-identification of clinical free text using natural language processing: A systematic review of current approaches.
Journal: Artificial Intelligence in Medicine
Published Date: Mar 20, 2024
Abstract
BACKGROUND: Electronic health records (EHRs) are a valuable resource for data-driven medical research. However, the presence of protected health information (PHI) makes EHRs unsuitable to be shared for research purposes. De-identification, i.e., the process of removing PHI, is a critical step in making EHR data accessible. Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process.