Validity of Natural Language Processing for Ascertainment of EGFR and ALK Test Results in SEER Cases of Stage IV Non-Small-Cell Lung Cancer.

Journal: JCO clinical cancer informatics
PMID:

Abstract

PURPOSE: SEER registries do not report results of epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) mutation tests. To facilitate population-based research in molecularly defined subgroups of non-small-cell lung cancer (NSCLC), we assessed the validity of natural language processing (NLP) for the ascertainment of EGFR and ALK testing from electronic pathology (e-path) reports of NSCLC cases included in two SEER registries: the Cancer Surveillance System (CSS) and the Kentucky Cancer Registry (KCR).

Authors

  • Bernardo Haddock Lobo Goulart
    Fred Hutchinson Cancer Research Center, Seattle, WA.
  • Emily T Silgard
    Fred Hutchinson Cancer Research Center, Seattle, WA.
  • Christina S Baik
    Fred Hutchinson Cancer Research Center, Seattle, WA.
  • Aasthaa Bansal
    The Comparative Health Outcomes, Policy & Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA. Electronic address: abansal@uw.edu.
  • Qin Sun
    Fred Hutchinson Cancer Research Center, Seattle, WA.
  • Eric B Durbin
    University of Kentucky, Lexington, KY.
  • Isaac Hands
    University of Kentucky, Lexington, KY.
  • Darshil Shah
    University of Kentucky, Lexington, KY.
  • Susanne M Arnold
    University of Kentucky, Lexington, KY.
  • Scott D Ramsey
    Fred Hutchinson Cancer Research Center, Seattle, WA.
  • Ramakanth Kavuluru
    Div. of Biomedical Informatics, Dept. of Internal Medicine, Dept. of Computer Science, University of Kentucky, Lexington, KY.
  • Stephen M Schwartz
    Fred Hutchinson Cancer Research Center, Seattle, WA.