Rethinking Generalizable Infrared Small Target Detection: A Real-scene Benchmark and Cross-view Representation Learning
Journal: arXiv
Published Date: Apr 23, 2025
Abstract
Infrared small target detection (ISTD) is highly sensitive to sensor type,
observation conditions, and the intrinsic properties of the target. These
factors can introduce substantial variations in the distribution of acquired
infrared image data, a phenomenon known as domain shift. Such distribution
discrepancies significantly hinder the generalization capability of ISTD models
across diverse scenarios. To tackle this challenge, this paper introduces an
ISTD framework enhanced by domain adaptation. To alleviate distribution shift
between datasets and achieve cross-sample alignment, we introduce Cross-view
Channel Alignment (CCA). Additionally, we propose the Cross-view Top-K Fusion
strategy, which integrates target information with diverse background features,
enhancing the model's ability to extract critical data characteristics. To
further mitigate the impact of noise on ISTD, we develop a Noise-guided
Representation learning strategy. This approach enables the model to learn more
noise-resistant feature representations, improving its generalization
capability across diverse noisy domains. Finally, we develop a dedicated
infrared small target dataset, RealScene-ISTD. Compared to state-of-the-art
methods, our approach demonstrates superior performance in terms of detection
probability (Pd), false alarm rate (Fa), and intersection over union (IoU). The
code is available at: https://github.com/luy0222/RealScene-ISTD.
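
For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how Pd, Fa, and IoU are often computed on binary masks. It is not taken from the paper or its repository. The function names (`iou`, `components`, `pd_fa`), the 8-connected grouping of pixels into targets, the centroid-distance matching rule, and the 3-pixel threshold are all assumptions chosen for illustration, and the paper's own evaluation protocol may differ.

```python
from collections import deque


def iou(pred, gt):
    """Pixel-level intersection over union of two binary masks (lists of rows)."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += int(bool(p) and bool(g))
            union += int(bool(p) or bool(g))
    # Two empty masks agree perfectly by convention (an assumption).
    return inter / union if union else 1.0


def components(mask):
    """Group foreground pixels into 8-connected components of (row, col) pairs."""
    h = len(mask)
    w = len(mask[0]) if mask else 0
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(comp)
    return comps


def centroid(comp):
    """Mean (row, col) position of a component."""
    n = len(comp)
    return sum(y for y, _ in comp) / n, sum(x for _, x in comp) / n


def pd_fa(pred, gt, max_dist=3.0):
    """Return (Pd, Fa) under the assumed centroid-matching convention.

    Pd = matched ground-truth targets / all ground-truth targets.
    Fa = pixels in unmatched predicted components / all image pixels.
    """
    gt_comps, pred_comps = components(gt), components(pred)
    matched = set()
    detected = 0
    for g in gt_comps:
        gy, gx = centroid(g)
        for i, p in enumerate(pred_comps):
            if i in matched:
                continue
            py, px = centroid(p)
            if ((gy - py) ** 2 + (gx - px) ** 2) ** 0.5 < max_dist:
                matched.add(i)
                detected += 1
                break
    false_pixels = sum(len(p) for i, p in enumerate(pred_comps) if i not in matched)
    total_pixels = len(gt) * (len(gt[0]) if gt else 0)
    pd = detected / len(gt_comps) if gt_comps else 0.0
    fa = false_pixels / total_pixels if total_pixels else 0.0
    return pd, fa
```

Under these assumptions, higher Pd and IoU and lower Fa indicate better detection. Pd and Fa are evaluated per target and per pixel respectively, so a model can score well on one while doing poorly on the other, which is why the three are usually reported together.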