Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen
Journal: arXiv
Published Date: May 27, 2025
Abstract
Traffic safety science has long been hindered by a fundamental data paradox:
the crashes we most wish to prevent are precisely those events we rarely
observe. Existing crash-frequency models and surrogate safety metrics rely
heavily on sparse, noisy, and under-reported records, while even sophisticated,
high-fidelity simulations undersample the long-tailed situations that trigger
catastrophic outcomes such as fatalities. We argue that the path to achieving
Vision Zero, i.e., the complete elimination of traffic fatalities and severe
injuries, requires a paradigm shift from traditional crash-only learning to a
new form of counterfactual safety learning: reasoning not only about what
happened, but also about the vast set of plausible yet perilous scenarios that
could have happened under slightly different circumstances. To operationalize
this shift, we propose an agenda that bridges the macro and micro scales.
Guided by crash-rate priors, generative scene engines, diverse driver models,
and causal learning synthesize and explain near-miss events. A crash-focused digital twin
testbed links micro scenes to macro patterns, while a multi-objective validator
ensures that simulations maintain statistical realism. This pipeline transforms
sparse crash data into rich signals for crash prediction, enabling the
stress-testing of vehicles, roads, and policies before deployment. By learning
from crashes that almost happened, we can shift traffic safety from reactive
forensics to proactive prevention, advancing Vision Zero.