Predicting Risk of Transfusion-Induced Red Blood Cell Alloimmunization Using Statistical and Machine Learning Approaches in the Recipient Epidemiology and Donor Evaluation Study (REDS-III) Database
Journal: bioRxiv
Published Date: Jan 1, 2025
Abstract
Red blood cell (RBC) alloimmunization is a common complication of blood transfusion that often results in accelerated destruction of donor RBCs. Patients vary substantially in their predisposition to RBC alloimmunization. Previous studies have identified several risk factors, but to our knowledge none has predicted the risk of RBC alloimmunization by modeling multiple potential risk factors simultaneously. This study represents the first attempt to build prediction models for RBC alloimmunization, using the large sample size and rich set of potential risk factors available in the Recipient Epidemiology and Donor Evaluation Study (REDS-III) recipient database. To develop the prediction models, we applied a range of approaches: traditional statistical models (logistic regression), modern machine learning methods (including gradient boosting, random forest, and XGBoost), deep learning (a multilayer perceptron), and large language models (LLMs). XGBoost demonstrated the best overall performance among models that provide uncertainty quantification (F1 = 0.672; area under the ROC curve [AUC-ROC] = 0.752). LLMs showed promising results, with the best F1 scores (0.677-0.687). They are limited by their inability to provide uncertainty estimates but hold potential for use as an interactive chatbot for patients. Although there is ample room for performance improvement, limiting the analysis to patients predicted with >80% confidence by XGBoost yielded a substantially improved AUC-ROC of 0.919, which may be of clinical significance.
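The confidence-restricted evaluation described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not define "confidence", so the sketch assumes it means max(p, 1 - p) for a predicted probability p of alloimmunization, and the function names (`auc_roc`, `f1_score`, `confident_subset`) and toy labels and probabilities are hypothetical, not REDS-III data. Any probabilistic classifier, such as XGBoost, could supply the probabilities.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a
    random negative (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confident_subset(
    y_true: Sequence[int], probs: Sequence[float], threshold: float = 0.80
) -> Tuple[List[int], List[float]]:
    """Keep only cases whose confidence exceeds the threshold.
    ASSUMPTION: confidence = max(p, 1 - p); the abstract does not define it."""
    kept = [(t, p) for t, p in zip(y_true, probs) if max(p, 1 - p) > threshold]
    return [t for t, _ in kept], [p for _, p in kept]


# Hypothetical toy data: 1 = alloimmunized, 0 = not alloimmunized.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
probs = [0.95, 0.05, 0.90, 0.10, 0.55, 0.60, 0.40, 0.45]

y_pred = [1 if p >= 0.5 else 0 for p in probs]  # default 0.5 decision cutoff
full_f1 = f1_score(y_true, y_pred)
full_auc = auc_roc(y_true, probs)

y_conf, p_conf = confident_subset(y_true, probs, threshold=0.80)
conf_auc = auc_roc(y_conf, p_conf)
coverage = len(y_conf) / len(y_true)  # fraction of patients retained

print(f"all patients:     F1={full_f1:.3f}  AUC-ROC={full_auc:.3f}")
print(f"confident subset: AUC-ROC={conf_auc:.3f}  coverage={coverage:.0%}")
```

In this toy example the confident subset separates better than the full set, but only half of the cases are retained. A gain in AUC-ROC from thresholding should therefore be read alongside the share of patients it covers, which the sketch reports as `coverage`.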