Revealing Treatment Non-Adherence Bias in Clinical Machine Learning Using Large Language Models

Journal: arXiv

Published Date: Feb 26, 2025

Abstract

Machine learning systems trained on electronic health records (EHRs) increasingly guide treatment decisions, but their reliability depends on the critical assumption that patients follow the prescribed treatments recorded in EHRs. Using EHR data from 3,623 hypertension patients, we investigate how treatment non-adherence introduces implicit bias that can fundamentally distort both causal inference and predictive modeling. By extracting patient adherence information from clinical notes using a large language model (LLM), we identify 786 patients (21.7%) with medication non-adherence. We further uncover key demographic and clinical factors associated with non-adherence, as well as patient-reported reasons including side effects and difficulties obtaining refills. Our findings demonstrate that this implicit bias can not only reverse estimated treatment effects, but also degrade model performance by up to 5% while disproportionately affecting vulnerable populations by exacerbating disparities in decision outcomes and model error rates. This highlights the importance of accounting for treatment non-adherence in developing responsible and equitable clinical machine learning systems.
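The claim that non-adherence can reverse an estimated treatment effect can be illustrated with a small arithmetic sketch. This is not the paper's method or data: the function name `naive_treatment_effect` and every number below (the 10 mmHg true effect, the blood-pressure means, the non-adherent fractions) are hypothetical values chosen only for illustration. The sketch assumes non-adherent patients are recorded as treated, receive no drug benefit, and have worse outcomes on average.

```python
# Hypothetical illustration only: all values are assumptions, not results from the paper.

def naive_treatment_effect(frac_nonadherent, true_effect, control_mean, nonadherent_mean):
    """Mean outcome of patients recorded as treated minus mean outcome of untreated.

    Adherent patients have outcome control_mean + true_effect. Non-adherent
    patients are recorded as treated in the EHR but have outcome nonadherent_mean.
    """
    adherent_mean = control_mean + true_effect
    recorded_treated_mean = (
        (1 - frac_nonadherent) * adherent_mean
        + frac_nonadherent * nonadherent_mean
    )
    return recorded_treated_mean - control_mean


TRUE_EFFECT = -10.0       # assumed: the drug lowers systolic blood pressure by 10 mmHg
CONTROL_MEAN = 150.0      # assumed mean systolic blood pressure without treatment
NONADHERENT_MEAN = 165.0  # assumed: non-adherent patients fare worse on average

for frac in (0.0, 0.2, 0.5):
    estimate = naive_treatment_effect(frac, TRUE_EFFECT, CONTROL_MEAN, NONADHERENT_MEAN)
    print(f"non-adherent fraction {frac:.1f}: naive estimate {estimate:+.1f} mmHg")
```

Under these assumed numbers the naive estimate shrinks toward zero as the non-adherent fraction grows and eventually changes sign, even though the true effect among adherent patients stays fixed at a 10 mmHg reduction.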

Authors

Zhongyuan Liang
Arvind Suresh
Irene Y. Chen

External Resources

arXiv: http://arxiv.org/abs/2502.19625v2
