Unveiling social determinants of health impact on adverse pregnancy outcomes through natural language processing.

Journal: Scientific Reports
Published Date:

Abstract

Understanding the role of Social Determinants of Health (SDoH) in pregnancy outcomes is critical for improving maternal and infant health, yet extracting SDoH from unstructured electronic health records remains challenging. We trained and evaluated natural language processing (NLP) models for SDoH extraction from clinical notes in the MIMIC-III database (86 notes), and externally evaluated them on the MIMIC-IV database (171 notes) to assess generalizability. Focusing on social support, occupation, and substance use, we compared rule-based, word embedding, and contextual language models. The ClinicalBERT model with a decision tree classifier achieved the highest performance for social support (F1: 0.92), while keyword processing excelled for occupation (F1: 0.74), and word embeddings with a random forest performed best for substance use (F1: 0.83). Logistic regression revealed significant associations between pregnancy complications and both substance use (OR 6.47, p < 0.001) and social support (OR 0.07, p < 0.001). Our study demonstrates the feasibility of NLP for SDoH extraction and underscores the clinical relevance of SDoH in maternal health.
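
As a rough illustration of the rule-based end of the comparison above, the following minimal sketch flags occupation mentions with keyword matching and scores the result with F1. It is not the authors' implementation: the keyword list, the function names, and the four toy notes are all hypothetical and do not come from MIMIC-III or MIMIC-IV.

```python
import re

# Hypothetical keyword list; the study's actual lexicon is not given in the abstract.
OCCUPATION_KEYWORDS = ["works as", "employed", "unemployed", "occupation", "job"]

# One case-insensitive pattern that matches any keyword on word boundaries.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in OCCUPATION_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def mentions_occupation(note: str) -> bool:
    """Return True if the note contains any occupation keyword."""
    return PATTERN.search(note) is not None


def f1_score(y_true, y_pred) -> float:
    """Binary F1: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy notes with hand-assigned labels (invented for illustration only).
toy_notes = [
    ("Patient works as a nurse, lives with husband.", True),
    ("Denies tobacco or alcohol use.", False),
    ("Currently unemployed; supported by family.", True),
    ("She is a teacher at a local school.", True),  # no keyword, so it is missed
]

labels = [label for _, label in toy_notes]
preds = [mentions_occupation(text) for text, _ in toy_notes]
f1 = f1_score(labels, preds)
print(f"F1 on toy notes: {f1:.2f}")  # prints "F1 on toy notes: 0.80"
```

The missed fourth note shows the usual limit of keyword rules: precision stays high, but recall drops whenever an occupation is described without a listed trigger word.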

Authors

  • Nidhi Soley
    Institute for Computational Medicine, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, United States.
  • MaKhaila Bentil
    Department of Computer Science Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA.
  • Jash Shah
    Department of Computer Science Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA.
  • Masoud Rouhizadeh
    Johns Hopkins University School of Medicine, Baltimore, Maryland, USA.
  • Casey Overby Taylor
    Johns Hopkins Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, United States of America.