De Novo Natural Language Processing Algorithm Accurately Identifies Myxofibrosarcoma From Pathology Reports.

Journal: Clinical Orthopaedics and Related Research
PMID:

Abstract

BACKGROUND: The codes available in ICD-10 do not accurately reflect soft tissue sarcoma diagnoses, which can lead to underrepresentation of soft tissue sarcoma in databases. The National VA Database provides a unique opportunity for soft tissue sarcoma investigation because all clinical results and pathology reports are available. Natural language processing (NLP) could be applied to clinical documents such as pathology reports to identify soft tissue sarcoma independent of ICD codes, allowing sarcoma researchers to build more comprehensive databases capable of answering a wide range of research questions.

Authors

  • Sarah E Lindsay
    Department of Orthopaedics and Rehabilitation, Oregon Health & Science University, Portland, OR, USA.
  • Cecelia J Madison
    Portland VA Medical Center, Portland, OR, USA.
  • Duncan C Ramsey
    Department of Orthopaedics and Rehabilitation, Oregon Health & Science University, Portland, OR, USA.
  • Yee-Cheen Doung
    Department of Orthopaedics and Rehabilitation, Oregon Health & Science University, Portland, OR, USA.
  • Kenneth R Gundle
    Department of Orthopaedics and Rehabilitation, Oregon Health & Science University, Portland, OR, USA; Operative Care Division, Portland VA Medical Center, Portland, OR, USA.