Is Multiclass Automatic Text De-Identification Worth the Effort?

Journal: Methods of Information in Medicine

Published Date: Sep 24, 2018

Abstract

OBJECTIVES: Automatic de-identification to remove protected health information (PHI) from clinical text can use a "binary" model that replaces redacted text with a generic tag (e.g., ""), or can use a "multiclass" model that retains more class information (e.g., ""). Binary models are easier to develop, but result in text that is potentially less informative. We investigated whether building a multiclass de-identification model is worth the extra effort.
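To make the binary versus multiclass distinction concrete, the following is a minimal sketch and not the authors' system. It assumes two hypothetical regular-expression patterns stand in for a trained PHI detector. The tag strings `[PHI]`, `[DATE]`, and `[PHONE]` are illustrative choices and are not taken from the paper.

```python
import re

# Hypothetical patterns for illustration only; a real de-identification
# system would typically use a trained model rather than two regexes.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def deidentify(text: str, multiclass: bool) -> str:
    """Replace each detected PHI span with a tag.

    Binary mode uses one generic tag for every span; multiclass mode
    keeps the class of the redacted span in the tag.
    """
    for label, pattern in PATTERNS.items():
        tag = f"[{label}]" if multiclass else "[PHI]"
        text = pattern.sub(tag, text)
    return text


note = "Seen on 03/14/2017; call 205-555-0100 to follow up."
binary = deidentify(note, multiclass=False)
multi = deidentify(note, multiclass=True)
print(binary)  # Seen on [PHI]; call [PHI] to follow up.
print(multi)   # Seen on [DATE]; call [PHONE] to follow up.
```

In the binary output a reader can no longer tell a date from a phone number, which is the loss of information the abstract refers to.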

Authors

Duy Duc An Bui
David T Redden
James J Cimino

Keywords

Algorithms
Data Anonymization
Electronic Health Records
False Positive Reactions
Humans

External Resources

PubMed (PMID: 30919392)
