Semantic-CD: Remote Sensing Image Semantic Change Detection towards Open-vocabulary Setting
Journal: arXiv
Published Date: Jan 12, 2025
Abstract
Remote sensing image semantic change detection is a task that aims to identify
areas of change, and to categorize those changes, in remote sensing images of
the same location taken at different times.
Traditional change detection methods often face challenges in generalizing
across semantic categories in practical scenarios. To address this issue, we
introduce a novel approach called Semantic-CD, specifically designed for
semantic change detection in remote sensing images. The method incorporates
open-vocabulary semantics from the vision-language foundation model CLIP.
By drawing on CLIP's extensive vocabulary knowledge, our model improves its
ability to generalize across categories and improves segmentation through fully
decoupled multi-task learning over two tasks: binary change detection and
semantic change detection. Semantic-CD consists of four main components:
a bi-temporal CLIP visual encoder for extracting features from bi-temporal
images, an open semantic prompter for creating open-vocabulary semantic cost
volume maps, a binary change detection decoder for generating binary change
detection masks, and a semantic change detection decoder for producing semantic
labels. Experimental results on the SECOND dataset demonstrate that Semantic-CD
achieves more accurate masks and reduces semantic classification errors,
illustrating its effectiveness in applying semantic priors from vision-language
foundation models to semantic change detection tasks.
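
The semantic cost volume mentioned in the abstract can be illustrated with a minimal sketch. It assumes, as is common in CLIP-based open-vocabulary segmentation, that the cost volume is the cosine similarity between per-pixel visual features and one text embedding per class name. The abstract does not specify this, and the function name `semantic_cost_volume`, the array shapes, and the toy inputs are hypothetical, not the authors' implementation.

```python
import numpy as np


def semantic_cost_volume(pixel_feats, text_embeds, eps=1e-8):
    """Cosine similarity between each pixel feature and each class embedding.

    pixel_feats: (H, W, D) array, a stand-in for dense image features
                 (assumption: one D-dimensional vector per pixel).
    text_embeds: (K, D) array, a stand-in for one text embedding per class name.
    Returns an (H, W, K) array of similarities in [-1, 1].
    """
    # L2-normalize both sides so the dot product becomes a cosine similarity.
    p = pixel_feats / (np.linalg.norm(pixel_feats, axis=-1, keepdims=True) + eps)
    t = text_embeds / (np.linalg.norm(text_embeds, axis=-1, keepdims=True) + eps)
    # (H, W, D) @ (D, K) -> (H, W, K)
    return p @ t.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy inputs: a 4x4 feature map with 8-dim features and 3 candidate classes.
    feats_t1 = rng.normal(size=(4, 4, 8))
    feats_t2 = rng.normal(size=(4, 4, 8))
    class_embeds = rng.normal(size=(3, 8))

    # One cost volume per acquisition date; argmax gives a per-pixel class index.
    labels_t1 = semantic_cost_volume(feats_t1, class_embeds).argmax(axis=-1)
    labels_t2 = semantic_cost_volume(feats_t2, class_embeds).argmax(axis=-1)
    print(labels_t1.shape, labels_t2.shape)
```

Because the class set enters only through `text_embeds`, adding or swapping categories in this sketch means changing the rows of that array rather than retraining a fixed classification head, which is the sense in which such a formulation is open-vocabulary.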