Optimizing CNN for pavement distress detection via edge-enhanced multi-scale feature fusion.
Journal: PLOS ONE
Published Date: Apr 9, 2025
Abstract
Crack detection initially relied on manual observation and later on instrument-assisted techniques. Today, road surface inspection uses deep learning to automate crack detection. However, in deep learning-based road surface damage classification, heterogeneous and complex road environments introduce significant background noise and unstructured features. These factors often undermine model robustness and generalization, which in turn reduces classification accuracy. To address this challenge, this research incorporates edge priors by integrating traditional edge detection techniques with deep convolutional neural networks (DCNNs). This paper proposes a mechanism called Edge-Enhanced Multi-Scale Feature Fusion (EE-MSFF), which enhances edge information through multi-scale feature extraction, thereby mitigating the impact of complex backgrounds and improving the model's focus on crack regions. Specifically, the mechanism applies classical edge detection operators such as Sobel, Prewitt, and Laplacian to extract multi-scale edge information during the model's feature extraction phase. This process captures both local edge features and global structural information in crack regions, making the model more resistant to interference from complex backgrounds. By employing multi-scale receptive fields, EE-MSFF fuses feature maps hierarchically, guiding the model to learn edge information that is correlated with crack regions. This strengthens the model's ability to perceive damaged pavement features in complex environments, improving classification accuracy and stability. The model was systematically trained and validated on both the complex-background dataset RDD2020 and the simple-background dataset Concrete_Data_Week3.
Experimental results show that the proposed model achieved a classification accuracy of 88.68% on the RDD2020 dataset and 99.5% on the Concrete_Data_Week3 dataset, where background interference is minimal. Furthermore, ablation studies were conducted to analyze the individual contribution of each module, highlighting the performance improvements associated with the integration of multi-scale edge features.
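
The multi-scale edge extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' EE-MSFF implementation: the abstract does not specify how scales are formed or how the maps are fused, so the sketch assumes that dilated 3x3 kernels stand in for multi-scale receptive fields, that the operators run on a grayscale image rather than on DCNN feature maps, and that fusion is a plain average. All function names (`correlate3x3`, `gradient_magnitude`, `multi_scale_edges`) are hypothetical.

```python
from typing import List, Sequence

Image = List[List[float]]

# Classical 3x3 edge operators named in the abstract.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
PREWITT_Y = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def correlate3x3(img: Image, kernel: Sequence[Sequence[float]], dilation: int = 1) -> Image:
    """Cross-correlate img with a 3x3 kernel using replicate (edge) padding.

    Assumption: a larger dilation spreads the kernel taps apart, which
    widens the receptive field and serves here as a coarser "scale".
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Clamp coordinates so border pixels are replicated.
                    yy = min(max(y + (ky - 1) * dilation, 0), h - 1)
                    xx = min(max(x + (kx - 1) * dilation, 0), w - 1)
                    acc += kernel[ky][kx] * img[yy][xx]
            out[y][x] = acc
    return out


def gradient_magnitude(img: Image, kx, ky, dilation: int) -> Image:
    """Combine horizontal and vertical responses into an edge strength map."""
    gx = correlate3x3(img, kx, dilation)
    gy = correlate3x3(img, ky, dilation)
    return [[(a * a + b * b) ** 0.5 for a, b in zip(ra, rb)] for ra, rb in zip(gx, gy)]


def multi_scale_edges(img: Image, dilations: Sequence[int] = (1, 2, 3)) -> Image:
    """Extract Sobel, Prewitt, and Laplacian edge maps at several scales
    and fuse them by averaging (an assumed, deliberately simple fusion rule)."""
    maps = []
    for d in dilations:
        maps.append(gradient_magnitude(img, SOBEL_X, SOBEL_Y, d))
        maps.append(gradient_magnitude(img, PREWITT_X, PREWITT_Y, d))
        maps.append([[abs(v) for v in row] for row in correlate3x3(img, LAPLACIAN, d)])
    h, w = len(img), len(img[0])
    return [[sum(m[y][x] for m in maps) / len(maps) for x in range(w)] for y in range(h)]


# Toy input: a dark region (0.0) next to a bright region (1.0), i.e. one vertical edge.
step_image = [[0.0] * 6 + [1.0] * 6 for _ in range(7)]
fused = multi_scale_edges(step_image)
```

In this toy input the fused map is non-zero only near the intensity step and zero in the flat regions, which is the behavior an edge prior is meant to provide. In a network, the same fixed kernels would more plausibly be applied as non-trainable convolutions over intermediate feature maps, with a learned fusion layer in place of the average; that design is an assumption here, not something the abstract states.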