An efficient low-shot class-agnostic counting framework with hybrid encoder and iterative exemplar feature learning.
Journal:
PloS one
Published Date:
Jun 6, 2025
Abstract
Few-shot learning techniques have enabled the rapid adaptation of a general AI model to various tasks using limited data. In this study, we focus on class-agnostic low-shot object counting, a challenging problem that aims to achieve accurate object counting with only a few annotated samples (few-shot) or even in the absence of any annotated data (zero-shot). In existing methods, the primary focus is often on enhancing performance, while relatively little attention is given to inference time-an equally critical factor in many practical applications. We propose a model that achieves real-time inference without compromising performance. Specifically, we design a multi-scale hybrid encoder to enhance feature representation and optimize computational efficiency. This encoder applies self-attention exclusively to high-level features and cross-scale fusion modules to integrate adjacent features, reducing training costs. Additionally, we introduce a learnable shape embedding and an iterative exemplar feature learning module, that progressively enriches exemplar features with class-level characteristics by learning from similar objects within the image, which are essential for improving subsequent matching performance. Extensive experiments on the FSC147, Val-COCO, Test-COCO, CARPK, and ShanghaiTech datasets demonstrate our model's effectiveness and generalizability compared to state-of-the-art methods.