Cannabis Use Documentation within the Electronic Health Record: A Use Case for Natural Language Processing Methods

Journal: medRxiv
Published Date:

Abstract

Introduction: Recreational and medical cannabis use (CU) information is often available within the electronic health record (EHR), but in a format that is impractical for health care providers to use. Natural language processing (NLP) can transform free-text EHR notes into discrete data elements and has the potential to characterize CU efficiently. The objective of this study was to develop an NLP algorithm to identify documentation of CU within unstructured EHR clinical notes.

Methods: We identified EHR notes containing cannabis-related terminology through a keyword search among all Geisinger patients with at least one encounter between 1/1/2013 and 6/30/2022. We trained four NLP models to classify notes into six categories based on the time, context, and reliability of CU documentation, as identified through manual annotation. Using the best-performing model, we compared the demographic characteristics of patients with a positive classification for CU to those of the overall population.

Results: Of more than 1.7 million eligible patients, 150,726 (8.6%) were flagged as cannabis users. Bio-ClinicalBERT, a transformer-based NLP model, achieved close to human performance in classifying CU (weighted Precision=91.4, Recall=93.3, F-score=92.4). Cannabis users had higher BMI and were at least nine-fold more likely to use tobacco, alcohol, and illicit substances.

Conclusion: Our study evaluated the prevalence of CU documentation across the entire corpus of EHR notes without population segmentation. The NLP methods used achieved performance close to that of human annotation and lay the foundation for identifying and classifying CU within unstructured data sources, with future applications in research and patient care.
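The weighted precision, recall, and F-score reported in the Results can be illustrated with a minimal sketch. It assumes the common support-weighted definition, in which each class's metric is weighted by that class's share of the true labels; the abstract does not say which weighting scheme was used. The function name `weighted_prf` and the three category labels are hypothetical and are not the study's six categories. Under this definition the weighted F-score is an average of per-class F-scores, so it need not equal the harmonic mean of the weighted precision and weighted recall.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Return support-weighted (precision, recall, F1) over the labels in y_true."""
    support = Counter(y_true)          # number of true examples per label
    total = len(y_true)
    w_p = w_r = w_f = 0.0
    for label, n in support.items():
        # True positives: notes whose true and predicted label are both `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = n / total             # class share of the true labels
        w_p += weight * precision
        w_r += weight * recall
        w_f += weight * f1
    return w_p, w_r, w_f


# Hypothetical toy labels, for illustration only (not data from the study).
y_true = ["current", "current", "past", "none"]
y_pred = ["current", "past", "past", "none"]
p, r, f = weighted_prf(y_true, y_pred)
print(f"weighted precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

In this toy case the weighted F1 (0.750) differs from the harmonic mean of the weighted precision and recall (about 0.808), which is why a reported weighted F-score cannot always be recomputed from the two other weighted figures alone.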

Authors

  • Pradhan, A. M.
  • Shetty, V. A.
  • Gregor, C.
  • Graham, J. H.
  • Tusing, L.
  • Hirsch, A. G.
  • Hall, E.
  • Troiani, V.
  • Davis, M. P.
  • Bieler, D. L.
  • Romagnoli, K. M.
  • Kraus, C. K.
  • Piper, B. J.
  • Wright, E. A.