Predicting the need for medical care after toxin exposure using SHAP-interpretable gradient boosting

Journal: medRxiv
Published Date:

Abstract

Objective: Experts in poison control centers (PCCs) must assess the severity of an exposure accurately and efficiently, neither delaying care nor sending patients to the hospital unnecessarily, using only the information given during a first phone call. To help healthcare professionals make these difficult decisions, we developed and evaluated a machine-learning algorithm that predicts whether a patient should seek medical help, based solely on the information provided during the first call to the PCC, for all kinds of mono-intoxications.

Methods: We extracted data recorded by clinicians at the Lyon PCC between 2000 and 2025. Cases with missing original recommendations were excluded. We trained and compared several machine-learning models, emphasizing decision-tree-based and gradient-boosted tree approaches. Two classification tasks were defined: (1) binary triage (recommend an emergency or non-emergency healthcare facility vs. stay at home) and (2) three-class triage (stay at home / non-emergency healthcare facility / emergency healthcare facility). Missing values were left as-is rather than imputed. Class imbalance was addressed during training. Model performance was evaluated using F1-score and ROC AUC, with cross-validation and bootstrapping used to obtain stable estimates and to assess statistical significance. Model explainability was assessed with SHAP to identify the features most important for predictions. We compared our results to published algorithms that focus on single-substance intoxications.

Results: After processing, 220,825 cases remained. Recommended dispositions were: stay at home 66.6%, emergency facility 25.4%, and non-emergency facility 7.4%. For the binary task, XGBoost achieved the best performance (F1 = 0.748; ROC AUC = 0.820). For the three-class task, XGBoost again performed best (macro F1 = 0.657; multiclass ROC AUC = 0.859). The delay from exposure to call, SNOMED symptom codes, and the circumstance of exposure were the most influential features. Our results were competitive with those of algorithms restricted to a single substance.

Conclusion: Gradient-boosted tree models can produce accurate, interpretable, and clinically relevant predictions of poisoning severity from routine PCC data. With external validation and prospective testing, such tools could complement expert judgment to improve triage consistency and patient outcomes.
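The abstract reports macro F1 for the three-class task and mentions bootstrapping to stabilize the estimates. As a minimal sketch of what those two quantities mean, the snippet below computes macro F1 and a percentile bootstrap interval in plain Python. Everything in it is an assumption made for illustration: the label names, the toy predictions, the function names, and the choice of a percentile bootstrap are not taken from the study, and the numbers it produces have no relation to the reported results.

```python
import random

# Hypothetical label names for the three-class triage task.
LABELS = ["home", "non_emergency", "emergency"]


def f1_for_class(y_true, y_pred, label):
    """F1 for one class treated as 'positive' against all others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for macro F1 (resampling cases with replacement)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(macro_f1([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    k = int(n_resamples * alpha / 2)  # number of resamples cut from each tail
    return scores[k], scores[-k - 1]


# Toy, made-up data: 6 'home', 2 'non_emergency', 4 'emergency' cases.
y_true = ["home"] * 6 + ["non_emergency"] * 2 + ["emergency"] * 4
y_pred = ["home", "home", "home", "home", "home", "emergency",
          "non_emergency", "home",
          "emergency", "emergency", "emergency", "home"]

score = macro_f1(y_true, y_pred)
low, high = bootstrap_ci(y_true, y_pred)
print(f"macro F1 = {score:.3f}, 95% bootstrap interval = [{low:.3f}, {high:.3f}]")
```

Because macro F1 averages the per-class scores without weighting by class frequency, a minority class such as the non-emergency facility (7.4% of cases in the abstract) influences the score as much as the majority class, which is one reason the metric is often preferred over accuracy on imbalanced triage data.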

Authors

  • Lerogeron, H.
  • Gueguen, L.
  • Chary, M.
  • Nguyen, K. A.

Categories