Hierarchical Multi-Positive Contrastive Learning for Patent Image Retrieval
Journal: arXiv
Published Date: Jun 16, 2025
Abstract
Patent images are technical drawings that convey information about a patent's
innovation. Patent image retrieval systems aim to search vast collections
and retrieve the most relevant images. Despite recent advances in information
retrieval, patent images still pose significant challenges due to their
technical intricacies and complex semantic information, requiring efficient
fine-tuning for domain adaptation. Current methods neglect patents'
hierarchical relationships, such as those defined by the Locarno International
Classification (LIC) system, which groups broad categories (e.g., "furnishing")
into subclasses (e.g., "seats" and "beds") and further into specific patent
designs. In this work, we introduce a hierarchical multi-positive contrastive
loss that leverages the LIC's taxonomy to induce such relations in the
retrieval process. Our approach assigns multiple positive pairs to each patent
image within a batch, with varying similarity scores based on the hierarchical
taxonomy. Our experimental analysis with various vision and multimodal models
on the DeepPatent2 dataset shows that the proposed method improves retrieval
results. Notably, our method is effective with low-parameter models, which
require fewer computational resources and can be deployed in environments
with limited hardware.
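
The idea of assigning several positives per image, weighted by how deep in the taxonomy two images agree, can be illustrated with a minimal sketch. This is not the paper's formulation: the label layout `(class, subclass, design)`, the level weights `(1.0, 0.5, 0.25)`, the temperature of `0.1`, and the function names are all assumptions chosen for illustration. The sketch treats the normalized hierarchy weights as a soft target over the other images in the batch and scores it against a softmax over cosine similarities.

```python
import math

def hierarchy_weight(a, b, weights=(1.0, 0.5, 0.25)):
    """Pair weight for (class, subclass, design) labels. Values are illustrative assumptions."""
    if a == b:            # same design
        return weights[0]
    if a[:2] == b[:2]:    # same class and subclass
        return weights[1]
    if a[0] == b[0]:      # same broad class only
        return weights[2]
    return 0.0            # unrelated: treated as a negative

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def hierarchical_multi_positive_loss(embeddings, labels, temperature=0.1):
    """Weighted cross-entropy between hierarchy weights and a softmax over similarities."""
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        w = [hierarchy_weight(labels[i], labels[j]) for j in others]
        w_sum = sum(w)
        if w_sum == 0.0:
            continue  # anchor has no positive at any level in this batch
        logits = [cosine(embeddings[i], embeddings[j]) / temperature for j in others]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += -sum((wj / w_sum) * (l - log_z) for wj, l in zip(w, logits))
        counted += 1
    return total / counted if counted else 0.0
```

With this sketch, a batch whose embeddings cluster by taxonomy yields a lower loss than one whose embeddings contradict it. A practical version would operate on batched tensors in a deep learning framework rather than on Python lists.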