PuFace: Defending against Facial Cloaking Attacks for Facial Recognition Models
Journal:
arXiv
Published Date:
Jun 4, 2024
Abstract
Recently proposed facial cloaking attacks add invisible perturbations
(cloaks) to facial images to protect users from being recognized by
unauthorized facial recognition models. However, we show that the "cloaks" are
not robust enough and can be removed from images.
This paper introduces PuFace, an image purification system leveraging the
generalization ability of neural networks to diminish the impact of cloaks by
pushing the cloaked images towards the manifold of natural (uncloaked) images
before the training process of facial recognition models. Specifically, we
devise a purifier that takes all the training images including both cloaked and
natural images as input and generates the purified facial images close to the
manifold where natural images lie. To meet the defense goal, we propose to
train the purifier on cloaked images whose cloaks are deliberately amplified,
with a loss function that combines an image loss and a feature loss. Our
experiments show that PuFace effectively defends against two state-of-the-art
facial cloaking attacks, reducing the attack success rate from 69.84% to 7.61%
on average without degrading the normal accuracy for various facial recognition
models. Moreover, PuFace is a model-agnostic defense mechanism that can be
applied to any facial recognition model without modifying the model structure.
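
The abstract names two ingredients of the purifier's training, amplified cloaked images and a loss combining image loss and feature loss, without giving formulas. The following is a minimal sketch of how such a setup could look, not the authors' implementation. Everything specific in it is an assumption made for illustration: the amplification factor `alpha`, the use of mean squared error for both terms, the weight `feature_weight`, and the `toy_features` function, which merely stands in for a real face feature extractor. Images are represented as flat lists of pixel values in [0, 1].

```python
from typing import Callable, List, Sequence

Vector = List[float]


def amplify_cloak(natural: Sequence[float], cloaked: Sequence[float],
                  alpha: float) -> Vector:
    """Scale the cloak (cloaked - natural) by alpha, then clip pixels to [0, 1].

    Assumption: amplification is a linear scaling of the perturbation.
    """
    return [min(1.0, max(0.0, n + alpha * (c - n)))
            for n, c in zip(natural, cloaked)]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def purifier_loss(purified: Sequence[float], natural: Sequence[float],
                  extract_features: Callable[[Sequence[float]], Vector],
                  feature_weight: float = 1.0) -> float:
    """Image loss plus weighted feature loss (both MSE here, by assumption)."""
    image_loss = mse(purified, natural)
    feature_loss = mse(extract_features(purified), extract_features(natural))
    return image_loss + feature_weight * feature_loss


def toy_features(image: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a face feature extractor: mean and range."""
    return [sum(image) / len(image), max(image) - min(image)]


if __name__ == "__main__":
    natural = [0.2, 0.4, 0.6, 0.8]
    cloaked = [0.25, 0.35, 0.65, 0.75]
    amplified = amplify_cloak(natural, cloaked, alpha=2.0)
    # An untrained "identity" purifier returns its input unchanged, so the
    # loss here measures how far each input is from the natural image.
    print("loss on cloaked input:  ", purifier_loss(cloaked, natural, toy_features))
    print("loss on amplified input:", purifier_loss(amplified, natural, toy_features))
```

In this sketch the image term pulls the purified output toward the natural image pixel by pixel, while the feature term penalizes differences that a recognition model would see. A purifier trained against the amplified inputs faces a larger perturbation than the one it would meet in practice, which is one plausible reading of why amplification helps.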