CLIP-Guided Backdoor Defense through Entropy-Based Poisoned Dataset Separation

Journal: arXiv

Published Date: Jul 7, 2025

Abstract

Deep Neural Networks (DNNs) are susceptible to backdoor attacks, where adversaries poison training data to implant a backdoor into the victim model. Current backdoor defenses on poisoned data often suffer from high computational costs or low effectiveness against advanced attacks like clean-label and clean-image backdoors. To address these limitations, we introduce CLIP-Guided backdoor Defense (CGD), an efficient and effective method that mitigates various backdoor attacks. CGD utilizes a publicly accessible CLIP model to identify inputs that are likely to be clean or poisoned. It then retrains the model with these inputs, using CLIP's logits as guidance to effectively neutralize the backdoor. Experiments on 4 datasets and 11 attack types demonstrate that CGD reduces attack success rates (ASRs) to below 1% while maintaining clean accuracy (CA) with a maximum drop of only 0.3%, outperforming existing defenses. Additionally, we show that clean-data-based defenses can be adapted to poisoned data using CGD. CGD also exhibits strong robustness, maintaining low ASRs even when employing a weaker CLIP model or when CLIP itself is compromised by a backdoor. These findings underscore CGD's exceptional efficiency, effectiveness, and applicability for real-world backdoor defense scenarios. Code: https://github.com/binyxu/CGD.
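
The separation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that CLIP zero-shot logits are already available as an array, and it assumes a cross-entropy score between CLIP's prediction and the dataset label as the separation criterion. The function names, the `clean_frac` and `suspect_frac` parameters, and the toy logits are all hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def label_cross_entropy(clip_logits, labels):
    """Cross-entropy of each dataset label under CLIP's predicted distribution.

    Assumption: a high value means CLIP disagrees with the given label,
    which this sketch treats as a sign of possible poisoning.
    """
    probs = softmax(clip_logits)
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked + 1e-12)

def separate_by_entropy(clip_logits, labels, clean_frac=0.5, suspect_frac=0.1):
    """Split sample indices into likely-clean and suspect sets (hypothetical rule).

    The lowest-scoring `clean_frac` of samples are returned as likely clean,
    the highest-scoring `suspect_frac` as suspect; the rest are left undecided.
    """
    scores = label_cross_entropy(clip_logits, labels)
    order = np.argsort(scores)            # ascending: most label-consistent first
    n = len(labels)
    n_clean = int(n * clean_frac)
    n_suspect = int(n * suspect_frac)
    clean_idx = order[:n_clean]
    suspect_idx = order[n - n_suspect:]   # empty when n_suspect == 0
    return clean_idx, suspect_idx

# Toy data: 4 samples, 3 classes. Sample 3 is labeled 0 although the
# (made-up) CLIP logits strongly favor class 2.
toy_logits = np.array([
    [5.0, 0.0, 0.0],
    [0.0, 4.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 0.0, 6.0],
])
toy_labels = np.array([0, 1, 2, 0])

clean_idx, suspect_idx = separate_by_entropy(
    toy_logits, toy_labels, clean_frac=0.5, suspect_frac=0.25
)
print("likely clean:", clean_idx.tolist())
print("suspect:", suspect_idx.tolist())
```

In a real pipeline the logits would come from a CLIP model's zero-shot classification over the dataset's class names, and the retraining stage guided by CLIP's logits would follow the split; neither is shown here.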

Authors

Binyan Xu
Fan Yang
Xilin Dai
Di Tang
Kehuan Zhang

External Resources

View on arXiv: http://arxiv.org/abs/2507.05113v1
