Interpretable machine learning for time-to-event prediction in medicine and healthcare.

Journal: Artificial Intelligence in Medicine
PMID:

Abstract

Time-to-event prediction, e.g., cancer survival analysis or hospital length of stay, is a highly prominent machine learning task in medical and healthcare applications. However, only a few interpretable machine learning methods address its challenges. To facilitate a comprehensive explanatory analysis of survival models, we formally introduce time-dependent feature effects and global feature importance explanations. We show how post-hoc interpretation methods allow for finding biases in AI systems predicting length of stay, using a novel multi-modal dataset created from 1235 X-ray images with textual radiology reports annotated by human experts. Moreover, we evaluate cancer survival models beyond predictive performance, including the importance of multi-omics feature groups, based on a large-scale benchmark comprising 11 datasets from The Cancer Genome Atlas (TCGA). Model developers can use the proposed methods to debug and improve machine learning algorithms, while physicians can discover disease biomarkers and assess their significance. We contribute open data and code resources to facilitate future work in the emerging research direction of explainable survival analysis.
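To give a concrete sense of the global feature importance explanations mentioned in the abstract, the following is a minimal generic sketch (not the paper's exact method) of permutation-based importance for a survival model: permute one feature at a time and measure the drop in Harrell's concordance index (C-index), the standard ranking metric in survival analysis. The synthetic data, the risk function, and all names here are illustrative assumptions; the event indicator is drawn at random as a simplified stand-in for censoring.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C-index: fraction of comparable pairs in which the
    subject with the shorter observed time has the higher predicted risk.
    Only pairs anchored at an observed event are comparable."""
    num, den = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue
        for j in range(n):
            if time[j] > time[i]:
                den += 1
                if risk[i] > risk[j]:
                    num += 1.0
                elif risk[i] == risk[j]:
                    num += 0.5
    return num / den

def permutation_importance(X, time, event, risk_fn, n_repeats=5, seed=0):
    """Mean drop in C-index after permuting each feature column."""
    rng = np.random.default_rng(seed)
    base = concordance_index(time, event, risk_fn(X))
    importances = []
    for k in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, k] = rng.permutation(Xp[:, k])
            drops.append(base - concordance_index(time, event, risk_fn(Xp)))
        importances.append(float(np.mean(drops)))
    return base, importances

# Illustrative synthetic survival data: only feature 0 drives the hazard.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
time = rng.exponential(scale=np.exp(-X[:, 0]))   # higher risk -> shorter time
event = rng.random(200) < 0.8                    # simplified ~20% censoring
base, imp = permutation_importance(X, time, event, lambda X: X[:, 0])
```

Permuting the informative feature collapses the C-index toward 0.5 (random ranking), yielding a large importance value, while features the risk function ignores get importance near zero.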

Authors

  • Hubert Baniecki
    MI2DataLab, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland.
  • Bartlomiej Sobieski
    University of Warsaw, Warsaw, Poland; Warsaw University of Technology, Warsaw, Poland.
  • Patryk Szatkowski
    Warsaw University of Technology, Warsaw, Poland; Medical University of Warsaw, Warsaw, Poland.
  • Przemyslaw Bombinski
    Warsaw University of Technology, Warsaw, Poland; Medical University of Warsaw, Warsaw, Poland.
  • Przemysław Biecek
    Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland.