Determination of Marital Status of Patients from Structured and Unstructured Electronic Healthcare Data.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
PMID:

Abstract

Social Determinants of Health, including marital status, are increasingly recognized as key drivers of health care utilization. This paper describes a robust method to determine the marital status of patients using structured and unstructured electronic healthcare data from a single academic institution in the United States. We developed and validated a natural language processing (NLP) pipeline for the ascertainment of marital status from clinical notes and compared its performance against two baselines: a machine learning n-gram model and structured data obtained from the electronic health record. Overall, our NLP engine had excellent performance on both document-level (F1 0.97) and patient-level (F1 0.95) classification, and it outperformed the baseline machine learning n-gram model. We also observed good agreement between the marital status obtained from our NLP engine and the baseline structured electronic healthcare data (κ 0.6).
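
The two kinds of metric reported above, F1 and Cohen's κ, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names and the six-patient label lists are hypothetical, made up only to show how the statistics are computed, and they do not reproduce the reported values.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both sources.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each source's marginal label counts.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources always use the same single label
    return (observed - expected) / (1.0 - expected)


def f1_score(gold, predicted, positive):
    """F1 (harmonic mean of precision and recall) for one target label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for six patients (illustrative only, not study data).
nlp_status = ["married", "married", "single", "single", "married", "single"]
ehr_status = ["married", "single", "single", "single", "married", "married"]

kappa = cohen_kappa(nlp_status, ehr_status)
f1_married = f1_score(ehr_status, nlp_status, positive="married")
print(f"kappa={kappa:.2f}  F1(married)={f1_married:.2f}")
```

In this sketch the structured labels stand in as the reference for F1 purely for illustration; the abstract reports F1 from the validation of the NLP pipeline and uses κ for the comparison with structured data.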

Authors

  • Brian T Bucher
    School of Medicine, University of Utah, Salt Lake City, Utah, US.
  • Jianlin Shi
    University of Utah, Salt Lake City, UT, USA.
  • Robert John Pettit
    University of Utah School of Medicine, Salt Lake City, UT, USA.
  • Jeffrey Ferraro
    Intermountain Healthcare, Salt Lake City, UT, USA.
  • Wendy W Chapman
    School of Medicine, University of Utah, Salt Lake City, Utah, US.
  • Adi Gundlapalli
    University of Utah School of Medicine, Salt Lake City, UT, USA.