Enhancing Patient-Physician Communication: Simulating African American Vernacular English in Medical Diagnostics with Large Language Models.

Journal: Journal of Healthcare Informatics Research

Abstract

Effective communication is crucial in reducing health disparities. However, linguistic differences, such as African American Vernacular English (AAVE), can create communication gaps between patients and physicians, negatively affecting care and outcomes. This study examines whether large language models (LLMs), specifically GPT-4 and Llama 3.3, can replicate AAVE in simulated clinical dialogues to improve cultural sensitivity. We tested four prompt types (BaseP, DemoP, LingP, and CompP) using United States Medical Licensing Examination (USMLE) case simulations. Statistical analyses of the models' outputs showed a significant difference among prompt types for both GPT-4 (F(2,70) = 6.218, p = 0.003) and Llama 3.3 (F(2,70) = 12.124, p < 0.001), indicating that including demographic information and/or explicit AAVE cues influences each model's output. Combining demographic and linguistic cues (CompP) yielded the highest mean AAVE feature counts (e.g., 9.83 for GPT-4 vs. 16.06 for Llama 3.3), although neither model fully captured the diversity of AAVE. Moreover, simply mentioning African American demographics triggered additional informal forms, suggesting built-in stereotypes or biases in both models. Overall, these findings highlight the promise of LLMs for culturally sensitive healthcare communication while underscoring the need for continued refinement to address stereotypes and more accurately represent diverse linguistic styles.

Authors

  • Yeawon Lee
    Drexel University, Philadelphia, PA 19104 USA.
  • Chia-Hsuan Chang
    Yale University, New Haven, CT 06510 USA.
  • Christopher C Yang
    Drexel University, Philadelphia, PA 19104 USA.
