Addressing The Devastating Effects Of Single-Task Data Poisoning In Exemplar-Free Continual Learning
Journal: arXiv
Published Date: Jul 5, 2025
Abstract
Our research addresses the overlooked security concerns related to data
poisoning in continual learning (CL). Data poisoning - the intentional
manipulation of training data to affect the predictions of machine learning
models - was recently shown to be a threat to CL training stability. While
existing literature predominantly addresses scenario-dependent attacks, we
propose to focus on simpler and more realistic single-task poison (STP)
threats. In contrast to previously proposed poisoning settings, in STP
adversaries lack knowledge of and access to the model, as well as to both previous
and future tasks. During an attack, they only have access to the current task
within the data stream. Our study demonstrates that even under these stringent
conditions, adversaries can compromise model performance using standard image
corruptions. We show that STP attacks are able to strongly disrupt the whole
continual training process, reducing both the stability (performance on
past tasks) and the plasticity (capacity to adapt to new tasks) of the algorithm.
Finally, we propose a high-level defense framework for CL along with a poison
task detection method based on task vectors. The code is available at
https://github.com/stapaw/STP.git .
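
The STP threat model described above can be illustrated with a minimal sketch. It is not the code from the repository: the function names (`poison_current_task`, `build_stream`), the choice of Gaussian noise as the "standard image corruption", and the `severity` parameter are all assumptions made for illustration. The sketch only shows the access pattern: the adversary touches the images of a single task and never sees the model or any other task.

```python
import numpy as np

def poison_current_task(images, severity=0.5, seed=0):
    """Corrupt every image of one task with additive Gaussian noise.

    Hypothetical helper: images are assumed to be floats in [0, 1];
    `severity` is the noise standard deviation (an illustrative choice).
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, severity, size=images.shape)
    return np.clip(images.astype(np.float64) + noise, 0.0, 1.0)

def build_stream(tasks, poisoned_task_id, severity=0.5, seed=0):
    """Return a task stream in which exactly one task has corrupted images.

    `tasks` is a list of (images, labels) pairs. Labels are left untouched,
    and no model or other task is consulted, mirroring the STP assumption
    that the adversary only has access to the current task.
    """
    stream = []
    for task_id, (images, labels) in enumerate(tasks):
        if task_id == poisoned_task_id:
            images = poison_current_task(images, severity, seed)
        stream.append((images, labels))
    return stream

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Three toy tasks of 8 RGB images (32x32) with 2 classes each.
    tasks = [(rng.random((8, 32, 32, 3)), rng.integers(0, 2, size=8))
             for _ in range(3)]
    stream = build_stream(tasks, poisoned_task_id=1)
    for task_id, (images, _) in enumerate(stream):
        changed = not np.array_equal(images, tasks[task_id][0])
        print(f"task {task_id}: poisoned={changed}")
```

A continual learner trained on such a stream would receive the corrupted task like any other; how strongly this affects stability and plasticity is what the paper evaluates, and the sketch makes no claim about it.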