Artificial intelligence for mortality risk stratification in septic shock: A systematic review and meta-analysis.

Journal: International journal of medical informatics
Published Date:

Abstract

BACKGROUND: Septic shock is associated with high mortality, making accurate risk stratification crucial for tailored treatment. Traditional clinical scoring systems are limited by static assessments and suboptimal accuracy. Artificial intelligence (AI) offers the potential for dynamic and personalized prediction; however, its clinical utility requires rigorous evaluation.
OBJECTIVE: This systematic review and meta-analysis evaluates the diagnostic accuracy of AI models for mortality risk stratification in septic shock and compares their performance against traditional clinical scoring systems.
METHODS: A systematic literature search was performed across Embase, PubMed, the Cochrane Library, and Web of Science, covering studies up to September 17, 2025. Studies developing or validating AI models for mortality risk stratification in adult septic shock patients were included. Quality was assessed using the PROBAST+AI tool. Pooled sensitivity, specificity, and the area under the curve (AUC) were calculated using a bivariate random-effects model.
RESULTS: Thirteen studies involving 56,502 patients were included. AI achieved a pooled sensitivity of 0.65, a specificity of 0.81, and an AUC of 0.80 (confidence interval [CI]: 0.77-0.84). Notably, recurrent neural networks (RNN) and support vector machines (SVM) achieved the highest AUC (0.91). Compared with traditional clinical scoring systems, AI demonstrated significantly higher specificity (0.81 vs. 0.66, P < 0.05) and AUC (0.80 vs. 0.69, P < 0.05). The most frequently utilized algorithm was logistic regression. No significant performance difference was observed between internal and external validation sets (AUC 0.80 vs. 0.78, P > 0.05). Substantial heterogeneity existed among the included studies, potentially arising from variations in AI algorithms, number of centers, septic shock definitions, data sources, mortality endpoints, clinical departments, and data splitting methods.
CONCLUSIONS: AI demonstrates promising discrimination and higher specificity than traditional clinical scoring systems. However, the evidence remains limited by substantial heterogeneity and potential bias, and its clinical utility requires further validation through prospective, multicenter studies.
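
As a rough illustration of how pooled sensitivity and specificity can be derived from per-study 2x2 counts, the sketch below pools logit-transformed proportions with a univariate DerSimonian-Laird random-effects estimator. This is a simplified stand-in and not the bivariate random-effects model the abstract names: the bivariate model additionally estimates the correlation between sensitivity and specificity across studies, which this sketch ignores. The study counts, function names, and the 0.5 continuity correction are all assumptions made for illustration; none of them come from the review.

```python
import math

# Hypothetical per-study counts (TP, FN, TN, FP); NOT data from the review.
HYPOTHETICAL_STUDIES = [
    (120, 60, 400, 90),
    (45, 30, 150, 40),
    (300, 150, 900, 210),
    (80, 35, 260, 70),
]


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction (an assumption of this sketch) keeps the
    logit finite when a cell is zero.
    """
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling of study-level effects."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures between-study heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    k = len(effects)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pooled_proportion(counts):
    """counts: iterable of (events, total).

    Returns (estimate, lower, upper, tau2), with a normal-approximation
    95% interval back-transformed from the logit scale.
    """
    pairs = [logit_and_variance(e, n) for e, n in counts]
    effects = [p[0] for p in pairs]
    variances = [p[1] for p in pairs]
    pooled, se, tau2 = pool_random_effects(effects, variances)
    return (
        inv_logit(pooled),
        inv_logit(pooled - 1.96 * se),
        inv_logit(pooled + 1.96 * se),
        tau2,
    )


def pooled_sens_spec(studies):
    """Pool sensitivity (TP / (TP + FN)) and specificity (TN / (TN + FP))."""
    sens = pooled_proportion([(tp, tp + fn) for tp, fn, tn, fp in studies])
    spec = pooled_proportion([(tn, tn + fp) for tp, fn, tn, fp in studies])
    return sens, spec


if __name__ == "__main__":
    sens, spec = pooled_sens_spec(HYPOTHETICAL_STUDIES)
    print("sensitivity: %.3f (%.3f-%.3f), tau^2=%.4f" % sens)
    print("specificity: %.3f (%.3f-%.3f), tau^2=%.4f" % spec)
```

Pooling on the logit scale keeps the back-transformed estimate and interval inside (0, 1). A full diagnostic accuracy meta-analysis would normally fit the bivariate model in dedicated statistical software rather than pooling the two measures separately as done here.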

Authors
