Show and tell: A critical review on robustness and uncertainty for a more responsible medical AI.

Journal: International Journal of Medical Informatics

Abstract

This critical review explores two interrelated trends: the rapid increase in studies on machine learning (ML) applications within health informatics and the growing concerns about the reproducibility of these applications across different healthcare settings. Addressing these concerns necessitates acknowledging the uncertainty inherent in evaluating medical decision support systems. Therefore, we emphasize the importance of external validation and robustness assessment of the underlying ML models to better estimate their performance across diverse real-world scenarios. To raise awareness among health practitioners and ML researchers, we advocate for the widespread adoption of external validation practices and uncertainty quantification techniques. Our survey of the specialized literature reveals that fewer than 4% of studies published in high-impact medical informatics journals over the past 13 years have validated their systems using data from settings different from those that provided the training data. This low percentage is incompatible with responsible research, given the potential risks posed by unreliable ML models in healthcare. Raising the standards for medical AI evaluation is crucial to improving practitioners' understanding of the potential and limitations of decision support systems in real-world settings. It is essential that uncertainty is not hidden in studies aimed at advancing knowledge in this field.
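To make the two practices the abstract advocates concrete, the minimal sketch below pairs external validation with a basic uncertainty quantification technique: a percentile-bootstrap 95% confidence interval for AUROC, computed on a cohort drawn from a site other than the one that provided the training data. This is an illustrative example, not a method from the review itself; the names `model`, `X_external`, and `y_external` are hypothetical placeholders.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auroc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for AUROC on a held-out (external) cohort.

    Resamples patients with replacement and recomputes AUROC each time,
    so the interval reflects sampling uncertainty in the evaluation set.
    """
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample patient indices with replacement
        if np.unique(y_true[idx]).size < 2:
            continue  # AUROC is undefined when a resample contains one class only
        stats.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return roc_auc_score(y_true, y_score), (lo, hi)

# Hypothetical usage: `model` was trained on data from hospital A, while
# (X_external, y_external) come from hospital B (external validation).
# point, (lo, hi) = bootstrap_auroc_ci(y_external,
#                                      model.predict_proba(X_external)[:, 1])
# print(f"External AUROC = {point:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Reporting the interval alongside the point estimate keeps the uncertainty of the performance claim visible, which is precisely what the abstract argues should not be hidden.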

Authors

  • Luca Marconi
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy.
  • Federico Cabitza
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy.