Automated Learning of Temporal Expressions.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Clinical notes contain important temporal information that is critical for clinical diagnosis and treatment as well as for retrospective analyses. Manually created regular expressions are commonly used to extract temporal information; however, this approach can be time-consuming and brittle. We describe a novel algorithm for automatically learning regular expressions that recognize temporal expressions. Five classes of temporal expressions are identified. Keywords specific to those classes are used to retrieve snippets of text containing those keywords in context. Those snippets are used for Regular Expression Discovery Extraction (REDEx). The learned regular expressions are then evaluated using 10-fold cross-validation. Precision and recall are very high, above 0.95 for most classes.
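
The pipeline outlined in the abstract (keyword-in-context snippet retrieval, then precision and recall measured over cross-validation folds) might be sketched roughly as follows. This is an illustration only and is not the REDEx algorithm: the regular expression is written by hand rather than learned, and the function names, the character window, the sample strings, and the round-robin fold assignment are all assumptions made for the example.

```python
import re


def keyword_snippets(text, keyword, window=20):
    """Return each occurrence of `keyword` with up to `window` characters
    of surrounding context (hypothetical helper, not from the paper)."""
    snippets = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        snippets.append(text[start:end])
    return snippets


def precision_recall(pattern, labeled):
    """Score a regex against (text, is_temporal) pairs.
    A text counts as predicted positive if the pattern matches anywhere in it."""
    tp = fp = fn = 0
    for text, is_temporal in labeled:
        predicted = re.search(pattern, text) is not None
        if predicted and is_temporal:
            tp += 1
        elif predicted:
            fp += 1
        elif is_temporal:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k folds, assigning items
    to folds round-robin (an assumed, simplistic splitting scheme)."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


# Invented sample data, for illustration only.
note = "Patient seen 3 days ago. Follow up in 2 weeks."
snippets = keyword_snippets(note, "days", window=5)

# Hand-written stand-in for a learned expression.
pattern = r"\b\d+\s+(?:day|week|month|year)s?\b"
labeled = [
    ("3 days ago", True),
    ("in 2 weeks", True),
    ("last month", True),        # missed: no leading number
    ("2 tablets daily", False),
]
precision, recall = precision_recall(pattern, labeled)
folds = list(k_fold_indices(20, k=10))
```

In a full evaluation, a pattern would be learned from the training indices of each fold and scored on the held-out indices; here the same fixed pattern is scored once to keep the sketch short.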

Authors

  • Douglas Redd
    University of Utah, Salt Lake City, Utah, USA.
  • YiJun Shao
    University of Utah, Salt Lake City, Utah, USA.
  • Jing Yang
    Beijing Novartis Pharma Co. Ltd., Beijing, China.
  • Guy Divita
    VA Salt Lake City Health Care System, Salt Lake City, Utah, USA.
  • Qing Zeng-Treitler
    Veterans Affairs Medical Center, Washington, DC; George Washington University, Washington, DC.