Cost-aware active learning for named entity recognition in clinical text.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

OBJECTIVE: Active learning (AL) attempts to reduce annotation cost (ie, time) by selecting the most informative examples for annotation. Most approaches tacitly (and unrealistically) assume that the cost of annotating each sample is identical. This study introduces a cost-aware AL method that simultaneously models both the annotation cost and the informativeness of the samples, and evaluates the method through both simulation and user studies.
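
To illustrate the general idea of weighing informativeness against annotation cost, here is a minimal sketch. It is not the method described in this study: the entropy-based informativeness score, the linear cost model, its parameters (`seconds_per_token`, `overhead_seconds`), and all function names are assumptions chosen for illustration only.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one token's predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def informativeness(sentence_probs):
    """Sum of per-token entropies: a simple uncertainty score for a sentence."""
    return sum(token_entropy(p) for p in sentence_probs)

def estimated_cost(num_tokens, seconds_per_token=1.0, overhead_seconds=5.0):
    """Hypothetical linear cost model: fixed overhead plus time per token."""
    return overhead_seconds + seconds_per_token * num_tokens

def rank_by_uncertainty(pool):
    """Cost-blind baseline: rank sentences by informativeness alone."""
    scored = [(informativeness(probs), idx) for idx, probs in enumerate(pool)]
    return [idx for _, idx in sorted(scored, reverse=True)]

def rank_by_benefit_per_cost(pool):
    """Cost-aware ranking: informativeness divided by estimated annotation time."""
    scored = [
        (informativeness(probs) / estimated_cost(len(probs)), idx)
        for idx, probs in enumerate(pool)
    ]
    return [idx for _, idx in sorted(scored, reverse=True)]

# Toy pool: each sentence is a list of per-token label distributions
# (made-up numbers standing in for the output of an NER model).
pool = [
    [[0.9, 0.1]] * 20,    # long sentence, moderately uncertain tokens
    [[0.5, 0.5]] * 4,     # short sentence, highly uncertain tokens
    [[0.99, 0.01]] * 3,   # short sentence, confident tokens
]

print(rank_by_uncertainty(pool))       # [0, 1, 2]
print(rank_by_benefit_per_cost(pool))  # [1, 0, 2]
```

In this toy pool the cost-blind ranking prefers the long sentence because its total uncertainty is highest, while the cost-aware ranking prefers the short, highly uncertain sentence because it offers more uncertainty per estimated second of annotation time.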

Authors

  • Qiang Wei
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Yukun Chen
    Department of Biomedical Informatics, Vanderbilt University, School of Medicine, Nashville, TN, USA.
  • Mandana Salimi
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Joshua C Denny
    Vanderbilt University, Nashville, TN.
  • Qiaozhu Mei
    University of Michigan, Ann Arbor, MI.
  • Thomas A Lasko
    Vanderbilt University School of Medicine, Nashville, TN.
  • Qingxia Chen
    Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA.
  • Stephen Wu
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Amy Franklin
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Trevor Cohen
    University of Washington, Seattle, WA.
  • Hua Xu
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.