Cross-institution natural language processing for reliable clinical association studies: a methodological exploration.

Journal: Journal of clinical epidemiology

Published Date: Jan 14, 2024

Abstract

OBJECTIVES: Natural language processing (NLP) of clinical notes in electronic medical records is increasingly used to extract otherwise sparsely available patient characteristics, to assess their association with relevant health outcomes. Manual data curation is resource intensive and NLP methods make these studies more feasible. However, the methodology of using NLP methods reliably in clinical research is understudied. The objective of this study is to investigate how NLP models could be used to extract study variables (specifically exposures) to reliably conduct exposure-outcome association studies.

Authors

Madhumita Sushil

Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA.

Atul J Butte

Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA.

Ewoud Schuit

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.

Maarten van Smeden

Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands.

Artuur M Leeuwenberg

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands. Electronic address: a.m.leeuwenberg-15@umcutrecht.nl.

Keywords

Electronic Health Records; Female; Humans; Intensive Care Units; Male; Middle Aged; Natural Language Processing

External Resources

PubMed ID: 38219811
