Predictive Multiplicity in Survival Models: A Method for Quantifying Model Uncertainty in Predictive Maintenance Applications
Journal: arXiv
Published Date: Apr 16, 2025
Abstract
In many prediction tasks, models that achieve near-optimal performance can
still disagree substantially on individual-level
outcomes. This phenomenon, known as predictive multiplicity, has been formally
defined in binary, probabilistic, and multi-target classification, and
undermines the reliability of predictive systems. However, its implications
remain unexplored in the context of survival analysis, which involves
estimating the time until a failure or similar event while properly handling
censored data. We frame predictive multiplicity as a critical concern in
survival-based models and introduce formal measures -- ambiguity, discrepancy,
and obscurity -- to quantify it. This is particularly relevant for downstream
tasks such as maintenance scheduling, where precise individual risk estimates
are essential. Understanding and reporting predictive multiplicity helps build
trust in models deployed in high-stakes environments. We apply our methodology
to benchmark datasets from predictive maintenance, extending the notion of
multiplicity to survival models. Our findings show that ambiguity steadily
increases, reaching up to 40-45% of observations; discrepancy is lower but
exhibits a similar trend; and obscurity remains mild and concentrated in a few
models. These results demonstrate that multiple accurate survival models may
yield conflicting estimates of failure risk and degradation progression for
the same equipment. This highlights the need to explicitly measure and
communicate predictive multiplicity to ensure reliable decision-making in
process health management.
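
To make two of the measures named above concrete, the following minimal sketch computes ambiguity and discrepancy for a set of near-optimal models. It rests on assumptions not stated in the abstract: the definitions follow the usual classification formulations (ambiguity as the share of observations on which at least one competing model conflicts with a baseline model, discrepancy as the largest share of conflicts produced by any single competing model), and a "conflict" is taken to mean that two predicted risk scores differ by at least a hypothetical threshold `delta`. The function names, the threshold, and the toy risk scores are illustrative only and are not the paper's data or implementation. Obscurity is left out because the abstract does not define it.

```python
import numpy as np


def conflict_matrix(baseline, competitors, delta):
    """Boolean array of shape (n_models, n_obs).

    Entry [m, i] is True when competing model m's risk score for
    observation i differs from the baseline score by at least delta.
    The delta-based notion of conflict is an assumption of this sketch.
    """
    baseline = np.asarray(baseline, dtype=float)
    competitors = np.asarray(competitors, dtype=float)
    return np.abs(competitors - baseline) >= delta


def ambiguity(baseline, competitors, delta):
    """Share of observations with a conflict under at least one competitor."""
    return conflict_matrix(baseline, competitors, delta).any(axis=0).mean()


def discrepancy(baseline, competitors, delta):
    """Largest share of conflicting observations for any single competitor."""
    return conflict_matrix(baseline, competitors, delta).mean(axis=1).max()


# Toy risk scores for four units (illustrative values, not results).
baseline_risk = [0.10, 0.50, 0.80, 0.30]
competitor_risk = [
    [0.12, 0.70, 0.78, 0.30],  # conflicts with the baseline on unit 2 only
    [0.10, 0.52, 0.50, 0.33],  # conflicts with the baseline on unit 3 only
]

print("ambiguity:", ambiguity(baseline_risk, competitor_risk, delta=0.1))
print("discrepancy:", discrepancy(baseline_risk, competitor_risk, delta=0.1))
```

In this toy case two of the four units are contested by some competitor, while no single competitor contests more than one unit, so ambiguity exceeds discrepancy. This matches the qualitative pattern the abstract reports, where discrepancy is lower than ambiguity.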