DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Journal: arXiv
Published Date: Dec 14, 2024
Abstract
Multi-modal object Re-IDentification (ReID) aims to retrieve specific objects
by combining complementary information from multiple modalities. Existing
multi-modal object ReID methods primarily focus on the fusion of heterogeneous
features. However, they often overlook the dynamic quality changes in
multi-modal imaging. In addition, the shared information between different
modalities can weaken modality-specific information. To address these issues,
we propose a novel feature learning framework called DeMo for multi-modal
object ReID, which adaptively balances decoupled features using a mixture of
experts. To be specific, we first deploy a Patch-Integrated Feature Extractor
(PIFE) to extract multi-granularity and multi-modal features. Then, we
introduce a Hierarchical Decoupling Module (HDM) to decouple multi-modal
features into non-overlapping forms, preserving the modality uniqueness and
increasing the feature diversity. Finally, we propose an Attention-Triggered
Mixture of Experts (ATMoE), which replaces traditional gating with dynamic
attention weights derived from decoupled features. With these modules, DeMo
generates more robust multi-modal features. Extensive experiments on three
multi-modal object ReID benchmarks verify the effectiveness of our method.
The source code is available at https://github.com/924973292/DeMo.
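
To make the idea of replacing a learned gate with attention-derived weights more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`attention_weights`, `mixture_of_experts`, `make_scaling_expert`), the choice of the mean feature as the query, the scaled dot-product scoring, and the toy scaling experts are all assumptions made for illustration. The abstract only states that the expert weights are dynamic attention weights derived from the decoupled features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_weights(query, keys):
    """Scaled dot-product attention of one query against several keys."""
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    return softmax(scores)


def mixture_of_experts(features, experts):
    """Fuse decoupled features with attention weights instead of a gate.

    Assumption: the query is the mean of all decoupled features, and
    each expert handles exactly one decoupled feature.
    """
    n = len(features)
    d = len(features[0])
    query = [sum(f[j] for f in features) / n for j in range(d)]
    weights = attention_weights(query, features)
    outputs = [expert(f) for expert, f in zip(experts, features)]
    out_dim = len(outputs[0])
    fused = [
        sum(w * out[j] for w, out in zip(weights, outputs))
        for j in range(out_dim)
    ]
    return fused, weights


def make_scaling_expert(scale):
    """Hypothetical toy expert: multiplies its input by a constant."""
    return lambda f: [scale * x for x in f]


# Three toy "decoupled" features, one expert per feature.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
experts = [make_scaling_expert(s) for s in (1.0, 2.0, 3.0)]
fused, weights = mixture_of_experts(features, experts)
print(weights)
print(fused)
```

In this sketch the weights depend on the input features themselves, so a feature that agrees more strongly with the aggregated query receives a larger share of the fused output. A real model would use learned projections for the query and keys and learned networks for the experts.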