MS-DPPs: Multi-Source Determinantal Point Processes for Contextual Diversity Refinement of Composite Attributes in Text to Image Retrieval
Journal:
arXiv
Published Date:
Jul 9, 2025
Abstract
Result diversification (RD) is a crucial technique in Text-to-Image Retrieval
for enhancing the efficiency of a practical application. Conventional methods
focus solely on increasing the diversity metric of image appearances. However,
the diversity metric and its desired value vary depending on the application,
which limits the applications of RD. This paper proposes a novel task called
CDR-CA (Contextual Diversity Refinement of Composite Attributes). CDR-CA aims
to refine the diversities of multiple attributes, according to the
application's context. To address this task, we propose Multi-Source DPPs, a
simple yet strong baseline that extends the Determinantal Point Process (DPP)
to multi-sources. We model MS-DPP as a single DPP model with a unified
similarity matrix based on a manifold representation. We also introduce Tangent
Normalization to reflect contexts. Extensive experiments demonstrate the
effectiveness of the proposed method. Our code is publicly available at
https://github.com/NEC-N-SOGI/msdpp.