Which approach better samples extreme traffic conflicts? Conventional- vs. machine learning-based sampling methods.

Journal: Accident; analysis and prevention

Abstract

Extreme value theory has recently received considerable attention as a means of proactively estimating crash risk through a two-step procedure: extreme traffic conflicts are first sampled, and crash risk is then estimated from those sampled extremes. Although existing research has relied predominantly on a conventional sampling technique, there is no universally accepted practice for efficiently selecting threshold values, nor for evaluating how well the sampled extreme conflicts align with the conceptual crash severity level framework. This research addresses these issues by employing machine learning-based sampling methods, which do not require predefined thresholds, and by comparing the sampled extremes with the conceptual severity levels to assess their alignment. After a review of recent developments in machine learning techniques in transportation and other engineering fields, two promising machine learning sampling models, an autoencoder neural network and an isolation forest, were investigated using a database of vehicle-to-pedestrian conflicts at urban signalized intersections. Extreme conflicts sampled using the machine learning techniques and the conventional sampling technique (as a baseline) were assessed and compared using two criteria: their visual alignment with the conceptual severity level framework, and their compatibility with the extreme value distribution. The results demonstrate that the extreme conflicts selected by the machine learning methods mirror the conceptual severity levels better than those selected by the conventional sampling technique. Moreover, extremes classified by the isolation forest more closely preserve the characteristics of the empirical tail distributions, indicating a better contextual representation for modeling with the extreme value distribution than the autoencoder neural network and conventional sampling methods.
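The two-step procedure described in the abstract can be illustrated with a minimal sketch of a threshold-based baseline. This is not the authors' implementation: the synthetic data, the 0.90 quantile threshold, the function names, and the crash boundary are all hypothetical. The generalized Pareto parameters are estimated by the method of moments only to keep the example dependency-free, and the study may well use a different estimator. The machine learning samplers (autoencoder, isolation forest) are not reproduced here.

```python
import math
import random
import statistics


def sample_extremes_pot(values, quantile=0.90):
    """Step 1 (threshold-based sampling): keep excesses over a fixed quantile.

    The quantile is an assumption chosen for illustration; choosing it is
    exactly the difficulty the abstract points to.
    """
    ordered = sorted(values)
    threshold = ordered[int(quantile * (len(ordered) - 1))]
    excesses = [v - threshold for v in values if v > threshold]
    return threshold, excesses


def fit_gpd_moments(excesses):
    """Step 2a: method-of-moments estimates of the GPD shape and scale.

    Uses mean = scale / (1 - shape) and
    variance = scale^2 / ((1 - shape)^2 * (1 - 2 * shape)).
    """
    mean = statistics.fmean(excesses)
    var = statistics.variance(excesses)
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (1.0 + ratio)
    return shape, scale


def exceedance_probability(x, threshold, shape, scale, rate):
    """Step 2b: P(X > x) for x at or above the threshold.

    `rate` is the fraction of observations that exceeded the threshold.
    """
    y = x - threshold
    if y < 0:
        raise ValueError("x must be at or above the threshold")
    if abs(shape) < 1e-9:
        tail = math.exp(-y / scale)  # exponential limit as shape -> 0
    else:
        base = 1.0 + shape * y / scale
        # For a negative shape the tail has a finite upper endpoint.
        tail = base ** (-1.0 / shape) if base > 0 else 0.0
    return rate * tail


# Synthetic severity scores (larger = more severe); purely illustrative data.
rng = random.Random(0)
scores = [rng.expovariate(1.0) for _ in range(5000)]

threshold, excesses = sample_extremes_pot(scores, quantile=0.90)
shape, scale = fit_gpd_moments(excesses)
rate = len(excesses) / len(scores)

# Hypothetical crash boundary placed 5 units above the threshold.
risk = exceedance_probability(threshold + 5.0, threshold, shape, scale, rate)
print(f"threshold={threshold:.3f} shape={shape:.3f} scale={scale:.3f} risk={risk:.2e}")
```

Rerunning the sketch with a different `quantile` changes both the sampled extremes and the fitted tail, which is the sensitivity that motivates threshold-free sampling.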
