Automatic extraction of cancer registry reportable information from free-text pathology reports using multitask convolutional neural networks.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

OBJECTIVE: We implement 2 multitask learning (MTL) techniques, hard parameter sharing and cross-stitch, to train a word-level convolutional neural network (CNN) designed specifically for automatic extraction of cancer data from unstructured text in pathology reports. We show that learning related information extraction (IE) tasks jointly, with representations shared across the tasks, is important for achieving state-of-the-art classification accuracy and computational efficiency.
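
The two MTL techniques named above can be illustrated with a minimal sketch. This is not the authors' implementation: the word-level CNN is replaced here by a single dense layer for brevity, and the task names ("site", "grade"), layer sizes, and random weights are all hypothetical. In hard parameter sharing, one shared encoder feeds a separate output head per task. In a cross-stitch unit, each task keeps its own network, and the activations of the two tasks are mixed by a small learned matrix (called alpha below).

```python
# Illustrative sketch only: task names, sizes, and weights are hypothetical,
# and a dense layer stands in for the word-level CNN encoder.
import random


def linear(weights, x):
    """Dense layer without bias: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(v):
    return [max(0.0, a) for a in v]


def init(rows, cols, rng):
    """Random weight matrix with `rows` outputs and `cols` inputs."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def hard_sharing_forward(x, shared_w, heads):
    """Hard parameter sharing: a single shared encoder feeds every task head."""
    h = relu(linear(shared_w, x))  # representation shared by all tasks
    return {task: linear(w, h) for task, w in heads.items()}


def cross_stitch(h_a, h_b, alpha):
    """Cross-stitch unit: mix two tasks' activations with a 2x2 matrix alpha.

    Row 0 of alpha produces task A's new activation, row 1 task B's.
    An identity alpha means the tasks share nothing at this layer.
    """
    out_a = [alpha[0][0] * a + alpha[0][1] * b for a, b in zip(h_a, h_b)]
    out_b = [alpha[1][0] * a + alpha[1][1] * b for a, b in zip(h_a, h_b)]
    return out_a, out_b


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(4)]  # stand-in document features

# Hard parameter sharing: one encoder, two hypothetical classification heads.
shared_w = init(3, 4, rng)
heads = {"site": init(5, 3, rng), "grade": init(2, 3, rng)}
logits = hard_sharing_forward(x, shared_w, heads)
print({task: len(scores) for task, scores in logits.items()})

# Cross-stitch: two task-specific encoders whose activations are mixed.
h_site = relu(linear(init(3, 4, rng), x))
h_grade = relu(linear(init(3, 4, rng), x))
alpha = [[0.9, 0.1], [0.1, 0.9]]  # learned during training in practice
mixed_site, mixed_grade = cross_stitch(h_site, h_grade, alpha)
print(len(mixed_site), len(mixed_grade))
```

In this sketch, hard parameter sharing computes the encoder once per document regardless of the number of tasks, which is one plausible source of the efficiency gain the abstract mentions. The cross-stitch variant runs one encoder per task but lets training decide how much each task borrows from the other.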

Authors

  • Mohammed Alawad
    Computational Sciences and Engineering Division, Health Data Sciences Institute, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.
  • Shang Gao
    Department of Orthopedics, Orthopedic Center of Chinese PLA, Southwest Hospital, Third Military Medical University, Chongqing, 400038, P.R.China.
  • John X Qiu
  • Hong Jun Yoon
    Computational Sciences and Engineering Division, Health Data Sciences Institute, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.
  • J Blair Christian
    Biomedical Sciences, Engineering, and Computing Group, Health Data Science Institute, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
  • Lynne Penberthy
    Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland, USA.
  • Brent Mumphrey
    Louisiana Tumor Registry, Louisiana State University Health Sciences Center School of Public Health, New Orleans, Louisiana, USA.
  • Xiao-Cheng Wu
    Department of Epidemiology, Louisiana State University New Orleans School of Public Health, New Orleans, LA 70112, United States.
  • Linda Coyle
    Information Management Services Inc, Calverton, Maryland, USA.
  • Georgia Tourassi
    Computational Sciences and Engineering Division, Health Data Sciences Institute, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.