Clinical text annotation - what factors are associated with the cost of time?

Journal: AMIA Annual Symposium Proceedings (AMIA Symposium)

Abstract

Building high-quality annotated clinical corpora is necessary for developing statistical Natural Language Processing (NLP) models to unlock information embedded in clinical text, but it is also time consuming and expensive. Consequently, it is important to identify factors that may affect annotation time, such as the syntactic complexity of the text to be annotated and the vagaries of individual user behavior. However, limited work has been done to understand the annotation of clinical text. In this study, we aimed to investigate how factors inherent to the text affect annotation time for a named entity recognition (NER) task. We recruited 9 users to annotate a clinical corpus and recorded the annotation time for each sample. We then defined a set of factors that we hypothesized might affect annotation time and fit them as predictors in a linear regression model of annotation time. The linear regression model achieved an R² of 0.611 and revealed eight time-associated factors, including characteristics of sentences, individual users, and annotation order, with implications for the practice of annotation and the development of cost models for active learning research.
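To illustrate the kind of analysis the abstract describes, the sketch below fits an ordinary linear regression that predicts per-sample annotation time from text-, user-, and order-related factors and reports R². It is a minimal illustration only, not the authors' code: the feature names, the synthetic data, and the coefficients are assumptions made for the example, not the eight factors identified in the study.

```python
# Minimal sketch: regressing annotation time on hypothetical factors and
# reporting R^2, analogous in spirit to the model described in the abstract.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 500

# Hypothetical factors: sentence length, number of entities, annotator id,
# and the order in which each sample was annotated.
df = pd.DataFrame({
    "sentence_length": rng.integers(5, 60, n),
    "num_entities": rng.integers(0, 8, n),
    "annotator": rng.integers(0, 9, n),      # 9 users, as in the study
    "annotation_order": np.arange(n),
})

# Synthetic target: annotation time in seconds (illustrative coefficients only).
df["time_sec"] = (
    2.0 * df["num_entities"]
    + 0.5 * df["sentence_length"]
    - 0.01 * df["annotation_order"]          # a practice/ordering effect
    + rng.normal(0, 5, n)
)

# One-hot encode the annotator so individual user behavior enters the model.
X = pd.get_dummies(df.drop(columns="time_sec"),
                   columns=["annotator"], drop_first=True)
y = df["time_sec"]

model = LinearRegression().fit(X, y)
print(f"R^2 = {r2_score(y, model.predict(X)):.3f}")
```

Inspecting the fitted coefficients (e.g., `model.coef_`) is what would surface which factors are associated with annotation time in such a setup.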

Authors

  • Qiang Wei
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Amy Franklin
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Trevor Cohen
University of Washington, Seattle, WA, USA.
  • Hua Xu
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.