Cross-institution natural language processing for reliable clinical association studies: a methodological exploration.
Journal:
Journal of Clinical Epidemiology
Published Date:
Jan 14, 2024
Abstract
OBJECTIVES: Natural language processing (NLP) of clinical notes in electronic medical records is increasingly used to extract patient characteristics that are otherwise sparsely available and to assess their association with relevant health outcomes. Manual data curation is resource-intensive, and NLP methods make these studies more feasible. However, the methodology for using NLP reliably in clinical research is understudied. The objective of this study is to investigate how NLP models can be used to extract study variables (specifically exposures) so that exposure-outcome association studies can be conducted reliably.