Improving ACS prediction in T2DM patients by addressing false records in electronic medical records using propensity score.

Journal: Scientific Reports

Published Date: May 28, 2025

Abstract

This study aimed to improve the prediction performance of machine learning (ML) models by using propensity scores (PS) to address false records (i.e., false positives, false negatives, or missing values) in binary categorical variables in electronic medical records (EMRs). We used the EMRs of patients with type 2 diabetes mellitus (T2DM) treated with basal insulin at a tertiary university hospital in South Korea. We extended the definition of PS to the probability of having a record for a binary variable given the covariates. We calculated PS for the binary categorical variables in the patients' EMRs and built PS datasets from them. Using various ML algorithms, we developed acute coronary syndrome (ACS) prediction models on 80% of the dataset and validated them on the remaining 20%. We evaluated model performance using accuracy, recall, precision, F1 score, and area under the receiver operating characteristic curve (AUROC). Additionally, the Shapley Additive Explanations (SHAP) method was used to identify important clinical predictors of ACS. The study included 9,338 patients (average age 60.2 years; 56.6% male) across 10,184 treatment periods. The most prevalent comorbidities were hypertension (31.5%) and dyslipidemia (28.9%). Notably, 6.9% experienced ACS during their insulin treatment. ML models trained on the PS datasets generally outperformed those trained on the raw datasets. SHAP analysis showed that older age, higher baseline weight, higher baseline glucose, a history of antithrombotic therapy, a history of chest pain, and indicators of T2DM progression (e.g., senile cataract) were important ACS risk factors. We developed an ACS prediction model with improved performance and greater reliance on clinical predictors that align with current medical understanding.
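The expanded PS definition in the abstract, the probability of having a record for a binary variable given covariates, can be illustrated with a minimal sketch. The abstract does not say which estimator the authors used or exactly how the PS datasets were assembled, so everything here is an assumption: a plain logistic regression fitted by gradient descent, hypothetical covariates (standardized age and baseline glucose), a hypothetical `hypertension_recorded` flag, and a PS dataset formed by substituting the score for the raw 0/1 column.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def propensity_scores(X, recorded):
    """Return the estimated P(recorded = 1 | covariates) for every row."""
    w, b = fit_logistic(X, recorded)
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Hypothetical toy data (not from the study): two standardized covariates
# and a 0/1 flag for whether a hypertension record exists in the EMR.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
hypertension_recorded = [
    1 if random.random() < sigmoid(1.5 * age - 0.5) else 0 for age, _ in X
]

ps = propensity_scores(X, hypertension_recorded)

# Assumed construction of a "PS dataset": the continuous score is used
# as the feature in place of the raw binary record.
ps_dataset = [row + [p] for row, p in zip(X, ps)]
```

Under this reading, a patient whose covariates make a record likely receives a high score even when the record itself is absent, which is one way a false negative or missing entry could be softened before model training.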

Authors

David Seung U Lee

Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, 08826, South Korea.

Jung-Hyun Won

Center for Convergence Approaches in Drug Development, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea; Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea.

Howard Lee

Department of Applied Biomedical Engineering, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea; Center for Convergence Approaches in Drug Development, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea; Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea; Department of Clinical Pharmacology and Therapeutics, Seoul National University College of Medicine and Hospital, Seoul, Korea; Advanced Institute of Convergence Technology, Suwon, Korea. Electronic address: howardlee@snu.ac.kr.

Keywords

Aged; Algorithms; Comorbidity; Diabetes Mellitus, Type 2; Electronic Health Records; Female; Humans; Machine Learning; Male; Middle Aged; Propensity Score; Republic of Korea

External Resources

PubMed ID: 40437059
