Leveraging Intermediate Features of Vision Transformer for Face Anti-Spoofing
Journal: arXiv
Published Date: May 30, 2025
Abstract
Face recognition systems are designed to be robust against changes in head
pose, illumination, and blur during image capture. If a malicious person
presents a photo of a registered user's face, they may illegitimately pass
the authentication process. Such spoofing attacks need to be detected
before face recognition is performed. In this paper, we propose a spoofing attack detection
method based on Vision Transformer (ViT) to detect minute differences between
live and spoofed face images. The proposed method uses the intermediate
features of ViT, which strike a good balance between the local and global
features important for spoofing attack detection, to compute the loss during
training and the score during inference. The proposed method also introduces
two data augmentation methods, face anti-spoofing data augmentation and
patch-wise data augmentation, to improve the accuracy of spoofing attack
detection. We
demonstrate the effectiveness of the proposed method through experiments using
the OULU-NPU and SiW datasets.
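
The idea of scoring from an intermediate block rather than the final one can be illustrated with a small, dependency-free sketch. Everything below is a hypothetical toy and not the authors' implementation: `toy_block` is a crude stand-in for a transformer block (it only blends each token with the token mean), and the block count, the chosen layer index, and the linear weights are illustrative assumptions. The sketch only shows the mechanics of keeping every block's output and computing a score from a chosen intermediate one.

```python
import math
import random


def token_mean(tokens):
    """Per-dimension mean over all tokens."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]


def toy_block(tokens, mix=0.5):
    """Assumed stand-in for a transformer block: blend each token with the mean."""
    mean = token_mean(tokens)
    return [[(1 - mix) * x + mix * m for x, m in zip(t, mean)] for t in tokens]


def forward_collect(tokens, num_blocks):
    """Run the toy encoder and keep the output of every block."""
    outputs = []
    for _ in range(num_blocks):
        tokens = toy_block(tokens)
        outputs.append(tokens)
    return outputs


def token_spread(tokens):
    """Average squared deviation of tokens from their mean (patch-level variation)."""
    mean = token_mean(tokens)
    total = sum((x - m) ** 2 for t in tokens for x, m in zip(t, mean))
    return total / (len(tokens) * len(mean))


def score_from_layer(outputs, layer, weights, bias=0.0):
    """Mean-pool the tokens of one block and map them to a score in (0, 1)."""
    pooled = token_mean(outputs[layer])
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


random.seed(0)
# 16 hypothetical patch tokens with 8 dimensions each.
patch_tokens = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(16)]
outputs = forward_collect(patch_tokens, num_blocks=12)

weights = [0.1] * 8  # illustrative, untrained weights
intermediate_score = score_from_layer(outputs, layer=5, weights=weights)

# In this toy, later blocks retain less patch-level variation than earlier ones.
spread_mid = token_spread(outputs[5])
spread_last = token_spread(outputs[11])
```

In this toy encoder the tokens drift toward their common mean with depth, so `spread_mid` is larger than `spread_last`. That behavior is a property of the simplified block used here and should not be read as a measurement of a real ViT.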