Obtaining Knowledge in Pathology Reports Through a Natural Language Processing Approach With Classification, Named-Entity Recognition, and Relation-Extraction Heuristics.

Journal: JCO Clinical Cancer Informatics
Published Date:

Abstract

PURPOSE: Robust institutional tumor banks depend on continuous sample curation; otherwise, subsequent biopsy or resection specimens are overlooked after initial enrollment. Automating curation is hindered by semistructured free-text clinical pathology notes, which complicate data abstraction. Our motivation is to develop a natural language processing method that dynamically identifies the existing pathology specimen elements needed to locate specimens for future use, in a manner that other institutions can re-implement.

Authors

  • Tomasz Oliwa
    The University of Chicago, Chicago, IL.
  • Steven B Maron
    Memorial Sloan Kettering Cancer Center, New York, NY.
  • Leah M Chase
    The University of Chicago Medical Center, Chicago, IL.
  • Samantha Lomnicki
    The University of Chicago Medical Center, Chicago, IL.
  • Daniel V T Catenacci
    The University of Chicago Medical Center, Chicago, IL.
  • Brian Furner
    Pediatrics, University of Chicago, Chicago, IL, USA.
  • Samuel L Volchenboum
    The University of Chicago, Chicago, IL.