FUSION-CDW: Automated curation of scalable datasets for enhancing construction and demolition waste valorisation.

Journal: Waste management (New York, N.Y.)
Published Date:

Abstract

Construction and demolition waste (CDW) is a significant waste stream. However, in material recovery facilities (MRFs), CDW processing still relies heavily on manual sorting, which limits throughput and waste valorisation. Vision-enabled automation can improve CDW recycling, but robust performance requires large, site-representative datasets. Existing datasets cover only limited or conveyor-belt scenarios and usually require labour-intensive annotation. This paper proposes Fusion-Construction and Demolition Waste (FUSION-CDW), an automated dataset curation pipeline that fuses CDW instances onto 4K site backgrounds through random placement with controllable overlap ratio, size distribution, and clutter level. Using an 18,000-instance library spanning 12 material categories, FUSION-CDW generates 52,000 high-resolution images, organised into seven training subsets representing different regimes, alongside fixed, leakage-free validation and test splits. To quantify the visual complexity of the dataset, four complementary coefficients are introduced: the overlap complexity coefficient (OC), the clutter level coefficient (CC), the size complexity coefficient (SC), and the illumination complexity coefficient (IC). Compared with existing benchmarks, FUSION-CDW achieves the highest visual complexity, particularly in size complexity, indicating a fragment-dominated scenario similar to most MRF conditions. This study demonstrates that optimising training-set regimes (overlap, clutter, and size) is as vital as model selection for improving detection performance. The results provide practical guidance for dataset curation, enabling more robust perception systems capable of handling the complex variability inherent in industrial MRF operations.
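
The random placement step with a controllable overlap ratio might look roughly like the sketch below. This is not the authors' implementation: the function names (`overlap_ratio`, `place_instances`), the rejection-sampling strategy, and the definition of overlap as intersection area divided by the smaller box's area are all assumptions made for illustration. Here the clutter level is simply the number of instances requested, and the size distribution is whatever list of (width, height) pairs is passed in.

```python
import random


def overlap_ratio(a, b):
    """Intersection area over the smaller box's area (assumed definition).

    Boxes are (x, y, w, h) tuples in pixels.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    return (iw * ih) / min(aw * ah, bw * bh)


def place_instances(bg_w, bg_h, sizes, max_overlap, rng, max_tries=50):
    """Place boxes at random positions via rejection sampling (hypothetical).

    A candidate position is accepted only if its overlap ratio with every
    previously placed box stays at or below max_overlap. Instances that
    cannot be placed within max_tries attempts are skipped.
    """
    placed = []
    for w, h in sizes:
        if w > bg_w or h > bg_h:
            continue  # instance does not fit on the background at all
        for _ in range(max_tries):
            x = rng.randint(0, bg_w - w)
            y = rng.randint(0, bg_h - h)
            box = (x, y, w, h)
            if all(overlap_ratio(box, other) <= max_overlap for other in placed):
                placed.append(box)
                break
    return placed


# Example: 30 instances (clutter level) with sizes drawn uniformly
# (an arbitrary stand-in size distribution) on a 3840x2160 background.
rng = random.Random(0)
sizes = [(rng.randint(40, 400), rng.randint(40, 400)) for _ in range(30)]
boxes = place_instances(3840, 2160, sizes, max_overlap=0.3, rng=rng)
```

Because each accepted box records its own position and extent, bounding-box annotations come for free from the placement step, which is the general reason a compositing approach of this kind avoids manual labelling.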
