Proposing New RadLex Terms by Analyzing Free-Text Mammography Reports.

Journal: Journal of digital imaging

Published Date: Oct 1, 2018

Abstract

After years of development, the RadLex terminology contains a large set of controlled terms for the radiology domain, but gaps still exist. We developed a data-driven approach to discover new terms for RadLex by mining a large corpus of radiology reports using natural language processing (NLP) methods. Our system, developed for mammography, discovers new candidate terms by analyzing noun phrases in free-text reports to extend the mammography part of RadLex. It extracts noun phrases from free-text mammography reports and classifies each noun phrase as "Has Candidate RadLex Term" or "Does Not Have Candidate RadLex Term." We tested the performance of our algorithm using 100 free-text mammography reports. An expert radiologist determined the true positive and true negative RadLex candidate terms. We calculated precision/positive predictive value and recall/sensitivity metrics to judge the system's performance. Finally, to identify new candidate terms for enhancing RadLex, we applied our NLP method to 270,540 free-text mammography reports obtained from three academic institutions. Our method demonstrated a precision/positive predictive value of 0.77 (159/206 terms) and a recall/sensitivity of 0.94 (159/170 terms). The overall accuracy of the system was 0.80 (235/293 terms). When we ran our system on the set of 270,540 reports, it found 31,800 unique noun phrases that are potential candidates for RadLex. Our data-driven approach to mining radiology reports can identify new candidate terms for expanding the breast imaging lexicon portion of RadLex and may be a useful approach for discovering new candidate terms from other radiology domains.
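
The reported metrics can be reproduced from the fractions given in the abstract. The sketch below is illustrative only: the class and field names are hypothetical, and the false positive, false negative, and true negative counts are not stated in the abstract but are derived here from the reported denominators (206 predicted positives, 170 actual positives, 235 correct out of 293 terms).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for the evaluation counts."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def precision(self) -> float:
        # Positive predictive value: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Sensitivity: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    @property
    def accuracy(self) -> float:
        # (TP + TN) / all evaluated terms
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


# Counts derived from the abstract's fractions (an assumption, not reported directly):
#   159/206 -> TP = 159, FP = 206 - 159 = 47
#   159/170 -> FN = 170 - 159 = 11
#   235/293 -> TN = 235 - 159 = 76
counts = ConfusionCounts(tp=159, fp=47, fn=11, tn=76)

print(f"precision = {counts.precision:.2f}")
print(f"recall    = {counts.recall:.2f}")
print(f"accuracy  = {counts.accuracy:.2f}")
```

The four derived counts sum to 293, which matches the denominator reported for overall accuracy.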

Authors

Hakan Bulu

Department of Radiology and Department of Biomedical Data Science, Medical School Office Building (MSOB), Stanford University, 1265 Welch Road, X383, Stanford, CA, 94305-5464, USA.

Dorothy A Sippo

Department of Radiology, Avon Comprehensive Breast Evaluation Center, Massachusetts General Hospital, Wang Ambulatory Care Building, Suite 240, 15 Parkman Street, Boston, MA, 02114, USA. dsippo@mgh.harvard.edu.

Janie M Lee

Department of Radiology, Seattle Cancer Care Alliance, University of Washington, 825 Eastlake Avenue East, Suite G2-600, Seattle, WA, 98109, USA.

Elizabeth S Burnside

Department of Radiology, University of Wisconsin, Madison, WI, United States.

Daniel L Rubin

Department of Biomedical Data Science, Stanford University School of Medicine, Medical School Office Building, Stanford, CA, 94305-5479.

Keywords

Female; Humans; Mammography; Natural Language Processing; Radiology Information Systems; Research Report; Vocabulary, Controlled

External Resources

PubMed ID: 29560542
