Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning.

Journal: American journal of epidemiology
Published Date:

Abstract

We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve the performance of automated health-care claims-based algorithms for identifying anaphylaxis events. We used data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. One site's manually reviewed gold-standard outcomes data were used for model development and the other's for external validation, with performance assessed by cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site, 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis, compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development data and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.
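
The three metrics reported in the abstract (AUC, PPV, and sensitivity at a classification threshold) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `ppv_and_sensitivity` and `auc` are hypothetical, the data are invented toy values, and the sketch omits the cross-validation and confidence-interval estimation used in the study. It only shows how each quantity is defined for a set of adjudicated labels (1 = confirmed anaphylaxis) and model scores.

```python
from typing import Sequence, Tuple


def ppv_and_sensitivity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (PPV, sensitivity) when scores >= threshold are called positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1  # confirmed event correctly flagged
        elif predicted_positive and label == 0:
            fp += 1  # non-event incorrectly flagged
        elif not predicted_positive and label == 1:
            fn += 1  # confirmed event missed
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (invented values, not study data).
labels = [1, 1, 1, 0, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
print(ppv_and_sensitivity(labels, model_scores, threshold=0.5))
print(auc(labels, model_scores))
```

Raising the threshold in such a sketch generally trades sensitivity for PPV, which is the kind of trade-off the reported threshold (PPV 79%, sensitivity 66% in development data) reflects.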

Authors

  • David S Carrell
    Group Health Research Institute, Seattle, WA, 98101, USA.
  • Susan Gruber
    Innovation in Medical Evidence Development and Surveillance (IMEDS), Reagan-Udall Foundation for the FDA, Washington, District of Columbia.
  • James S Floyd
  • Maralyssa A Bann
  • Kara L Cushing-Haugen
  • Ron L Johnson
  • Vina Graham
  • David J Cronkite
    Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA.
  • Brian L Hazlehurst
  • Andrew H Felcher
  • Cosmin A Bejan
    Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN.
  • Adee Kennedy
    Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA.
  • Mayura U Shinde
  • Sara Karami
  • Yong Ma
    Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA.
  • Danijela Stojanovic
  • Yueqin Zhao
  • Robert Ball
    Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States.
  • Jennifer C Nelson