A Universal Detection Method for Adversarial Examples and Fake Images.

Journal: Sensors (Basel, Switzerland)
PMID:

Abstract

Deep-learning technologies have shown impressive performance on many tasks in recent years. However, using them carries several serious security risks. For example, state-of-the-art deep-learning models are vulnerable to adversarial examples, in which subtle, specifically crafted perturbations cause the model to make wrong predictions, and these technologies can also be abused to tamper with and forge multimedia, i.e., deep forgery. In this paper, we propose a universal detection framework for adversarial examples and fake images. We observe differences between the distributions of model outputs for normal examples and for adversarial examples (or fake images), and we train a detector to learn these differences. We perform extensive experiments on the CIFAR10 and CIFAR100 datasets. The experimental results show that the proposed framework is feasible and effective in detecting adversarial examples and fake images. Moreover, the framework generalizes well across different datasets and model structures.
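The core idea, a detector trained on differences in model-output distributions, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the detector architecture, the features, or the attacks used. The sketch assumes, purely for illustration, that normal inputs yield peaked softmax outputs while adversarial inputs yield flatter ones, simulates both with synthetic logits, and fits a small logistic-regression detector on two hypothetical features (maximum probability and entropy). All function names and parameter values below are assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def features(probs):
    """Hypothetical detector features: max probability and entropy."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return [max(probs), entropy]


def synthetic_output(rng, margin, num_classes=10):
    """Simulated model output. ASSUMPTION: a larger logit margin stands in
    for a normal input, a smaller one for an adversarial or fake input."""
    logits = [rng.gauss(0.0, 1.0) for _ in range(num_classes)]
    logits[rng.randrange(num_classes)] += margin
    return softmax(logits)


def make_dataset(rng, n_per_class):
    """Build (features, label) pairs; label 0 = normal, 1 = adversarial."""
    data = []
    for _ in range(n_per_class):
        data.append((features(synthetic_output(rng, 8.0)), 0))
        data.append((features(synthetic_output(rng, 2.0)), 1))
    rng.shuffle(data)
    return data


def predict_proba(w, b, x):
    """Probability that x comes from an adversarial input."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_detector(data, epochs=50, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            grad = predict_proba(w, b, x) - y
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def accuracy(w, b, data):
    """Fraction of examples classified correctly at a 0.5 threshold."""
    hits = sum((predict_proba(w, b, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    train, held_out = make_dataset(rng, 200), make_dataset(rng, 200)
    w, b = train_detector(train)
    print("held-out accuracy:", accuracy(w, b, held_out))
```

In a real setting, the synthetic outputs would be replaced by the outputs of a trained classifier on clean and attacked (or generated) CIFAR10 or CIFAR100 images, and the detector could be any model able to learn the distributional difference.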

Authors

  • Jiewei Lai
    School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
  • Yantong Huo
    School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China.
  • Ruitao Hou
    Institute of Artificial Intelligence and Blockchain, Guangzhou University, Guangzhou 510006, China.
  • Xianmin Wang
    Institute of Artificial Intelligence and Blockchain, Guangzhou University, Guangzhou 510006, China.