RoofNet: A Global Multimodal Dataset for Roof Material Classification
Journal: arXiv
Published Date: May 25, 2025
Abstract
Natural disasters are increasing in frequency and severity, causing hundreds
of billions of dollars in damage annually and posing growing threats to
infrastructure and human livelihoods. Accurate data on roofing materials is
critical for modeling building vulnerability to natural hazards such as
earthquakes, floods, wildfires, and hurricanes, yet such data remain
unavailable. To address this gap, we introduce RoofNet, a new multimodal dataset
for global roof material classification and the largest and most geographically
diverse to date. It comprises over 51,500 samples from 184 sites, each pairing
high-resolution Earth Observation (EO) imagery with curated text annotations.
The satellite imagery is labeled with 14 key roofing types -- such as asphalt
shingles, clay tiles, and metal sheets -- and the dataset is designed to enhance
the fidelity of global exposure datasets through vision-language models (VLMs).
We sampled EO tiles from
climatically and architecturally distinct regions to construct a representative
dataset. A subset of 6,000 images was annotated in collaboration with domain
experts to fine-tune a VLM. We used geographic- and material-aware prompt
tuning to enhance class separability. The fine-tuned model was then applied to
the remaining EO tiles, with predictions refined through rule-based and
human-in-the-loop verification. In addition to material labels, RoofNet
provides rich metadata including roof shape, footprint area, solar panel
presence, and indicators of mixed roofing materials (e.g., HVAC systems).
RoofNet supports scalable, AI-driven risk assessment and serves as a downstream
benchmark for evaluating model generalization across regions -- offering
actionable insights for insurance underwriting, disaster preparedness, and
infrastructure policy planning.
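
The labeling workflow described above (region- and material-aware prompts, model
predictions, then rule-based and human-in-the-loop verification) can be
illustrated with a minimal sketch. Everything in it is an assumption made for
illustration and is not taken from the RoofNet release: the record fields, the
prompt wording, the `build_prompt` and `triage` helpers, and the 0.8 confidence
threshold are all hypothetical. Only the three material names come from the
abstract; the remaining classes are not listed there and are left out.

```python
from dataclasses import dataclass
from typing import Optional

# Only three of the 14 classes are named in the abstract; the others are omitted.
KNOWN_MATERIALS = ["asphalt shingles", "clay tiles", "metal sheets"]


@dataclass
class RoofPrediction:
    """Hypothetical record for one EO tile; field names are illustrative."""
    tile_id: str
    region: str
    material: str
    confidence: float                         # assumed model score in [0, 1]
    roof_shape: Optional[str] = None
    footprint_area_m2: Optional[float] = None
    has_solar_panels: bool = False
    mixed_materials: bool = False


def build_prompt(region: str, materials: list[str]) -> str:
    """Compose a text prompt naming the region and the candidate materials."""
    options = ", ".join(materials)
    return (
        f"A satellite image of a building roof in {region}. "
        f"Which roofing material is it: {options}?"
    )


def triage(pred: RoofPrediction, min_confidence: float = 0.8) -> str:
    """Rule-based check: 'accept' the label or send it to human 'review'."""
    if pred.material not in KNOWN_MATERIALS:
        return "review"   # label falls outside the known class list
    if pred.mixed_materials:
        return "review"   # mixed roofs are treated as ambiguous here
    if pred.confidence < min_confidence:
        return "review"   # low-confidence prediction
    return "accept"
```

In this sketch, a prediction such as
`RoofPrediction("t1", "coastal region", "clay tiles", 0.93)` would be accepted,
while the same tile with a confidence of 0.41 would be routed to a human
reviewer. The actual rules and thresholds used for RoofNet are not stated in the
abstract.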