External validation of AI-based scoring systems in the ICU: a systematic review and meta-analysis.

Journal: BMC Medical Informatics and Decision Making
Published Date:

Abstract

BACKGROUND: Machine learning (ML) is increasingly used to predict clinical deterioration in intensive care unit (ICU) patients through scoring systems. Although promising, such algorithms often overfit their training cohort and perform worse at new hospitals. External validation is therefore a critical, but frequently overlooked, step in establishing the reliability of predicted risk scores before they are translated into clinical practice. We systematically reviewed how regularly external validation of ML-based risk scores is performed and how their performance changed on external data.

Authors

  • Patrick Rockenschaub
    CLAIM - Charité Lab for AI in Medicine, Charité - Universitätsmedizin Berlin, Berlin, Germany.
  • Ela Marie Akay
    CLAIM - Charité Lab for AI in Medicine, Charité - Universitätsmedizin Berlin, Berlin, Germany.
  • Benjamin Gregory Carlisle
    STREAM - Studies of Translation, Ethics and Medicine, School of Population and Global Health, McGill University, Montréal, Canada.
  • Adam Hilbert
    CLAIM - Charité Lab for Artificial Intelligence in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany.
  • Joshua Wendland
    Chair for Artificial Intelligence and Formal Methods, Faculty of Computer Science, Ruhr University, Bochum, Germany.
  • Falk Meyer-Eschenbach
    Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Berlin, Germany.
  • Anatol-Fiete Näher
    Digital Global Public Health, Hasso Plattner Institute for Digital Engineering, University of Potsdam, Potsdam, Germany.
  • Dietmar Frey
    CLAIM - Charité Lab for Artificial Intelligence in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany.
  • Vince Istvan Madai
    Charité Lab for Artificial Intelligence in Medicine-CLAIM, Charité - Universitätsmedizin Berlin, Berlin, Germany.