ConcreteCARB: A comprehensive image dataset of concrete carbonation for computer vision tasks.
Journal:
Data in Brief
Published Date:
Jan 20, 2026
Abstract
The ConcreteCARB dataset provides 903 high-resolution images of concrete surfaces evaluated with the phenolphthalein test for carbonation detection. The data were collected under controlled laboratory conditions to support artificial intelligence applications in civil engineering, especially structural health monitoring. The images are organized into two classes, "Carbonated Samples" and "No Carbonation Presence," enabling binary classification. All samples were manually tested, split, and visually labelled by expert engineers, in accordance with standardized procedures, to ensure reliable ground-truth classification. The dataset includes images of concrete prism elements fabricated with varying mix designs, incorporating different water-cement ratios and additives such as industrial silica waste and natural admixtures derived from Opuntia ficus-indica. The specimens were subjected to natural atmospheric carbonation for 180 days, and their carbonation fronts were revealed by phenolphthalein staining. The samples were then split manually with a chisel and hammer, and photographed with a Samsung SM-S901U1 smartphone using predefined settings to ensure consistency and quality across the dataset. ConcreteCARB is intended for researchers, engineers, and data scientists working on machine learning, deep learning, and computer vision solutions for concrete diagnostics. It provides training and benchmarking data for the development of automated detection, classification, and segmentation models for carbonation damage assessment. The dataset can also serve as a basis for cross-comparative studies of AI techniques in materials degradation analysis. It is openly accessible through a public repository, which supports reproducibility and encourages the extension of AI applications in concrete durability and sustainability studies.
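As an illustration of how the two-class organization could be used for binary classification, the following is a minimal sketch that indexes the images by class and makes a stratified train/test split. It assumes the repository stores one folder per class, named after the two class labels above; the folder layout, the file extensions, and the helper names `index_dataset` and `stratified_split` are hypothetical and not taken from the dataset documentation.

```python
import random
from pathlib import Path

# Assumed folder names, copied from the two class labels in the abstract.
# The actual repository layout may differ.
CLASS_DIRS = {"Carbonated Samples": 1, "No Carbonation Presence": 0}
# Assumed image file extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_dataset(root):
    """Return a list of (image_path, label) pairs found under root."""
    samples = []
    for class_name, label in CLASS_DIRS.items():
        class_dir = Path(root) / class_name
        if not class_dir.is_dir():
            continue  # skip a class folder that is missing
        for path in sorted(class_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split samples into train and test lists, preserving class proportions."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted({lbl for _, lbl in samples}):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Calling `index_dataset("path/to/ConcreteCARB")` and passing the result to `stratified_split` would give two lists of path and label pairs that can be fed to any image-loading pipeline. Splitting per class keeps the proportion of carbonated and non-carbonated samples the same in both subsets, which matters if the two classes are not equally represented.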