A Question-and-Answer System to Extract Data From Free-Text Oncological Pathology Reports (CancerBERT Network): Development Study.

Journal: Journal of Medical Internet Research
Published Date:

Abstract

BACKGROUND: Information in pathology reports is critical for cancer care. Natural language processing (NLP) systems used to extract information from pathology reports are often narrow in scope or require extensive tuning. Consequently, there is growing interest in automated deep learning approaches. A powerful new NLP algorithm, bidirectional encoder representations from transformers (BERT), was published in late 2018. BERT set new performance standards on tasks as diverse as question answering, named entity recognition, speech recognition, and more.

Authors

  • Joseph Ross Mitchell
    Department of Machine Learning, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.
  • Phillip Szepietowski
    Department of Health Data Services, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.
  • Rachel Howard
    Department of Health Data Services, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.
  • Phillip Reisman
    Department of Health Data Services, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.
  • Jennie D Jones
    Department of Health Data Services, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.
  • Patricia Lewis
    Department of Health Data Services, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.
  • Brooke L Fridley
    Department of Biostatistics and Bioinformatics, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.
  • Dana E Rollison
    Department of Health Data Services, H Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States.