Learning Knowledge-based Prompts for Robust 3D Mask Presentation Attack Detection
Journal:
arXiv
Published Date:
May 6, 2025
Abstract
3D mask presentation attack detection is crucial for protecting face
recognition systems against the rising threat of 3D mask attacks. While most
existing methods utilize multimodal features or remote photoplethysmography
(rPPG) signals to distinguish between real faces and 3D masks, they face
significant challenges, such as the high costs associated with multimodal
sensors and limited generalization ability. Detection-related text descriptions
offer concise, universal information and are cost-effective to obtain. However,
the potential of vision-language multimodal features for 3D mask presentation
attack detection remains unexplored. In this paper, we propose a novel
knowledge-based prompt learning framework to explore the strong generalization
capability of vision-language models for 3D mask presentation attack detection.
Specifically, our approach incorporates entities and triples from knowledge
graphs into the prompt learning process, generating fine-grained, task-specific
explicit prompts that effectively harness the knowledge embedded in pre-trained
vision-language models. Furthermore, considering different input images may
emphasize distinct knowledge graph elements, we introduce a visual-specific
knowledge filter based on an attention mechanism to refine relevant elements
according to the visual context. Additionally, we leverage causal graph theory
insights into the prompt learning process to further enhance the generalization
ability of our method. During training, a spurious correlation elimination
paradigm is employed, which removes category-irrelevant local image patches
using guidance from knowledge-based text features, fostering the learning of
generalized causal prompts that align with category-relevant local patches.
Experimental results demonstrate that the proposed method achieves
state-of-the-art intra- and cross-scenario detection performance on benchmark
datasets.