A Computable Phenotype for Acute Respiratory Distress Syndrome Using Natural Language Processing and Machine Learning.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium

Published Date: Dec 5, 2018

Abstract

Acute Respiratory Distress Syndrome (ARDS) is a syndrome of respiratory failure that may be identified using text from radiology reports. The objective of this study was to determine whether natural language processing (NLP) with machine learning performs better than a traditional keyword model for ARDS identification. Reports underwent linguistic pre-processing, and the resulting text features were used as inputs to machine learning classifiers, which were tuned using 10-fold cross-validation on 80% of the sample and tested on the remaining 20%. A cohort of 533 patients was evaluated, with a data corpus of 9,255 radiology reports. The traditional model had an accuracy of 67.3% (95% CI: 58.3-76.3) with a positive predictive value (PPV) of 41.7% (95% CI: 27.7-55.6). The best NLP model had an accuracy of 83.0% (95% CI: 75.9-90.2) with a PPV of 71.4% (95% CI: 52.1-90.8). A computable phenotype for ARDS with NLP may identify more cases than the traditional model.
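
The evaluation design described in the abstract (an 80%/20% split, then accuracy and PPV with 95% confidence intervals on the held-out portion) can be illustrated with a minimal sketch. The abstract does not say how the intervals were computed; the normal-approximation (Wald) interval used here is an assumption, and the function names `split_80_20`, `wald_ci`, and `accuracy_and_ppv` are hypothetical helpers, not code from the study. The labels in the usage example are made up for illustration.

```python
import math
import random


def split_80_20(items, seed=0):
    """Shuffle items reproducibly and return (80% tuning set, 20% test set)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald interval.

    Assumption: the abstract does not name its CI method; Wald is used
    here only as a simple illustration. Bounds are clipped to [0, 1].
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def accuracy_and_ppv(y_true, y_pred):
    """Compute accuracy and PPV (each with a Wald CI) for binary labels.

    Labels are 1 (ARDS) or 0 (not ARDS). PPV is None when the model
    makes no positive predictions, since it is undefined in that case.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = wald_ci(tp + tn, len(pairs))
    ppv = wald_ci(tp, tp + fp) if (tp + fp) > 0 else None
    return {"accuracy": accuracy, "ppv": ppv}


if __name__ == "__main__":
    # Hypothetical patient identifiers; 533 matches the cohort size above.
    tuning_ids, test_ids = split_80_20(range(533))
    print(len(tuning_ids), len(test_ids))

    # Made-up labels and predictions, not data from the study.
    truth = [1, 1, 0, 0, 1, 0, 0, 1]
    preds = [1, 0, 0, 1, 1, 0, 0, 1]
    print(accuracy_and_ppv(truth, preds))
```

With small held-out sets, a Wald interval can be wide or poorly calibrated near 0% or 100%; a Wilson or exact interval would be a reasonable alternative under the same sketch.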

Authors

Majid Afshar
Loyola University Chicago, Chicago, IL.

Cara Joyce
Loyola University Chicago, Chicago, IL.

Anthony Oakey
Department of Computer Science, Loyola University Chicago, Chicago, IL.

Perry Formanek
Department of Medicine, Loyola University Medical Center, Maywood, IL.

Philip Yang
Department of Medicine, Loyola University Medical Center, Maywood, IL.

Matthew M Churpek
Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States.

Richard S Cooper
Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, IL.

Susan Zelisko
Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, IL.

Ron Price
Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, IL.

Dmitriy Dligach
Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, IL.

Keywords

Adult; Aged; Area Under Curve; Cohort Studies; Diagnosis, Computer-Assisted; Electronic Health Records; Female; Humans; Length of Stay; Male; Middle Aged; Natural Language Processing; Predictive Value of Tests; Radiography, Thoracic; Respiratory Distress Syndrome; Risk Factors; Supervised Machine Learning; Unified Medical Language System

External Resources

PubMed (30815053)
