AI Bias and Confounding Risk in Health Feature Engineering for Machine Learning Classification Task.
Journal:
Studies in health technology and informatics
Published Date:
Aug 7, 2025
Abstract
Recent advancements in machine learning bring unique opportunities in health fields but also pose considerable challenges. Due to stringent ethical considerations and resource constraints, health data can vary in scope, population coverage, and collection granularity, making it prone to different AI bias and confounding risks in the performance of a classification task. This experimental study explored the impact of hidden confounding risk on model performance in a cardiovascular readmission prediction task using real-life health data from 'Data-derived Risk assessment using the Electronic medical record through Application of Machine Learning' (DREAM). Five commonly used machine learning models, k-nearest neighbors (KNN), random forest (RF), decision tree (DT), CatBoost and XGBoost, were selected for this task. Model performance was assessed via the area under the receiver operating characteristics curve (AUC) and F1 score, both before and after propensity score adjustment. Based on a density plot comparison before and after adjustment, the difference was mainly contributed by patients aged between 20 and 40. High fluctuation in model performance was noted when including and excluding patients in this age group. Further reasoning suggested that high-risk pregnancy may serve as a confounding factor in the original model generation: the pregnancy rate in the non-readmitted group was significantly higher than that in the readmitted group (χ² = 10.2, p < 0.001). However, pregnancy status required an additional information query from a different hospital system. Without careful consideration of confounding risks, a traditional pipeline may generate a less robust classifier in the clinical setting. Incorporating propensity score matching could be a solution to randomise invisible confounding factors between the classes.
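The propensity score adjustment described in the abstract could be sketched as follows. This is a minimal illustration on synthetic data, not the DREAM pipeline: the hidden binary confounder (standing in for pregnancy status), the covariates, the greedy 1:1 nearest-neighbour matching scheme, and the random forest evaluation are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2_contingency
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic cohort: age plus a hidden binary confounder that is far more
# common in younger patients (a stand-in for high-risk pregnancy status).
n = 2000
age = rng.uniform(20, 80, n)
confounder = (rng.random(n) < np.where(age < 40, 0.3, 0.02)).astype(int)

# Readmission depends on age and, negatively, on the hidden confounder.
logit = 0.03 * (age - 50) - 1.5 * confounder
readmit = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
X = age.reshape(-1, 1)  # the confounder is NOT available as a feature

# Chi-square check: is the confounder distributed differently by class?
table = [
    [np.sum((readmit == 1) & (confounder == 1)), np.sum((readmit == 1) & (confounder == 0))],
    [np.sum((readmit == 0) & (confounder == 1)), np.sum((readmit == 0) & (confounder == 0))],
]
chi2, p, _, _ = chi2_contingency(table)

# Step 1: estimate propensity scores for class membership from covariates.
ps = LogisticRegression().fit(X, readmit).predict_proba(X)[:, 1]

# Step 2: greedy 1:1 nearest-neighbour matching on the propensity score.
pos = np.where(readmit == 1)[0]
neg = list(np.where(readmit == 0)[0])
matched = []
for i in pos:
    j = min(neg, key=lambda k: abs(ps[k] - ps[i]))
    matched.extend([i, j])
    neg.remove(j)
matched = np.array(matched)

# Step 3: compare classifier performance before and after matching.
def evaluate(idx):
    Xtr, Xte, ytr, yte = train_test_split(
        X[idx], readmit[idx], random_state=0, stratify=readmit[idx]
    )
    clf = RandomForestClassifier(random_state=0).fit(Xtr, ytr)
    prob = clf.predict_proba(Xte)[:, 1]
    return roc_auc_score(yte, prob), f1_score(yte, clf.predict(Xte))

auc_all, f1_all = evaluate(np.arange(n))
auc_m, f1_m = evaluate(matched)
print(f"full cohort: AUC={auc_all:.3f} F1={f1_all:.3f}")
print(f"matched    : AUC={auc_m:.3f} F1={f1_m:.3f}")
```

Matching on the propensity score balances the observed covariates between the readmitted and non-readmitted classes, which also tends to equalise any hidden factor correlated with them, so the performance gap before and after matching signals how much the unmatched model relied on the confounded age range.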