Identification of pancreatic cancer risk factors from clinical notes using natural language processing.

Journal: Pancreatology : official journal of the International Association of Pancreatology (IAP) ... [et al.]
Published Date:

Abstract

OBJECTIVES: Screening for pancreatic ductal adenocarcinoma (PDAC) is considered in high-risk individuals (HRIs) with established PDAC risk factors, such as family history and germline mutations in PDAC susceptibility genes. Accurate assessment of risk factor status depends on provider knowledge and requires extensive manual chart review by experts. Natural Language Processing (NLP) has shown promise in automated data extraction from the electronic health record (EHR). We aimed to use NLP for automated extraction of PDAC risk factors from unstructured clinical notes in the EHR.

Authors

  • Dhruv Sarwal
    Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA.
  • Liwei Wang
    Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA.
  • Sonal Gandhi
    Division of Medical Oncology, Sunnybrook Health Sciences Centre, Toronto, Canada.
  • Elham Sagheb Hossein Pour
    Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA.
  • Laurens P Janssens
    Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA.
  • Adriana M Delgado
    Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA.
  • Karen A Doering
    Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA.
  • Anup Kumar Mishra
  • Jason D Greenwood
    Department of Family Medicine, Mayo Clinic, Rochester, MN.
  • Hongfang Liu
    Department of Artificial Intelligence & Informatics, Mayo Clinic, Rochester, MN, United States.
  • Shounak Majumder
    Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA.