AuxDet: Auxiliary Metadata Matters for Omni-Domain Infrared Small Target Detection
Journal: arXiv
Published Date: May 21, 2025
Abstract
Omni-domain infrared small target detection (IRSTD) poses formidable
challenges, as a single model must seamlessly adapt to diverse imaging systems,
varying resolutions, and multiple spectral bands simultaneously. Current
approaches rely predominantly on visual-only modeling, which struggles with
complex background interference and inherently scarce target features and
generalizes poorly across omni-scene environments with significant domain
shifts and appearance variations. In this work, we reveal a critical oversight in existing
paradigms: the neglect of readily available auxiliary metadata describing
imaging parameters and acquisition conditions, such as spectral bands, sensor
platforms, resolution, and observation perspectives. To address this
limitation, we propose the Auxiliary Metadata Driven Infrared Small Target
Detector (AuxDet), a novel multi-modal framework that incorporates textual
metadata into the IRSTD pipeline for scene-aware optimization. Through a
high-dimensional fusion module based on multi-layer
perceptrons (MLPs), AuxDet dynamically integrates metadata semantics with
visual features, guiding adaptive representation learning for each individual
sample. Additionally, we design a lightweight prior-initialized enhancement
module using 1D convolutional blocks to further refine fused features and
recover fine-grained target cues. Extensive experiments on the challenging
WideIRSTD-Full benchmark demonstrate that AuxDet consistently outperforms
state-of-the-art methods, validating the critical role of auxiliary information
in improving robustness and accuracy in omni-domain IRSTD tasks. Code is
available at https://github.com/GrokCV/AuxDet.
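
The metadata-driven fusion described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the AuxDet implementation: the abstract does not specify the architecture, so everything here is an assumption made for illustration. The metadata vocabulary (`METADATA_FIELDS`), the one-hot encoding, the two-layer MLP that produces per-channel gates, and the use of a 1D convolution over the gate vector as a stand-in for the enhancement module are all hypothetical. A real model would use a deep learning framework and learned weights.

```python
import math
import random

# Hypothetical metadata vocabulary (an assumption, not taken from the paper).
METADATA_FIELDS = {
    "band": ["shortwave", "longwave", "near_infrared"],
    "platform": ["land", "air", "space"],
}


def encode_metadata(meta):
    """One-hot encode each metadata field and concatenate the results."""
    vec = []
    for field, choices in METADATA_FIELDS.items():
        vec.extend(1.0 if meta[field] == choice else 0.0 for choice in choices)
    return vec


def init_linear(rng, n_in, n_out):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, weights, bias):
    """Dense layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def metadata_gates(meta_vec, layer1, layer2):
    """Two-layer MLP mapping a metadata vector to per-channel gates in (0, 1)."""
    hidden = [max(0.0, h) for h in linear(meta_vec, *layer1)]  # ReLU
    logits = linear(hidden, *layer2)
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]  # sigmoid


def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation; output has the same length as the input."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def modulate(features, gates):
    """Scale each channel of features[c][h][w] by the gate for that channel."""
    return [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(features, gates)
    ]


rng = random.Random(0)
channels = 4
meta_vec = encode_metadata({"band": "longwave", "platform": "space"})
layer1 = init_linear(rng, len(meta_vec), 8)
layer2 = init_linear(rng, 8, channels)

gates = metadata_gates(meta_vec, layer1, layer2)
# Identity kernel: the refinement starts as a no-op (an illustrative choice).
refined_gates = conv1d_same(gates, [0.0, 1.0, 0.0])

features = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(channels)]
fused = modulate(features, refined_gates)
```

Because the gates depend only on the metadata vector, two images with the same visual content but different acquisition conditions receive different channel weightings, which is the per-sample adaptation the abstract refers to.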