Erasing with Precision: Evaluating Specific Concept Erasure from Text-to-Image Generative Models
Journal: arXiv
Published Date: Feb 19, 2025
Abstract
Many studies have sought to prevent pretrained text-to-image generative models
from generating specific concepts, achieving concept erasure in various ways.
However, the performance evaluation of these studies still relies largely on
visualization, and which method is superior is often judged subjectively.
The metrics used for quantitative evaluation also
vary, making comprehensive comparisons difficult. We propose EraseEval, an
evaluation method that differs from previous ones in that it is built on three
fundamental evaluation criteria: (1) how well a prompt containing the target
concept is reflected, (2) to what extent the impact of the erased concept is
reduced for concepts related to it, and (3) whether other concepts are
preserved. These criteria are evaluated and integrated into a single metric,
such that a lower score is given if any one of the evaluations is low, leading
to a more robust assessment. We experimentally
evaluated baseline concept erasure methods, organized their characteristics,
and identified challenges with them. Although these evaluation criteria are
fundamental, some concept erasure methods failed to achieve high scores, which
points toward future research directions for concept erasure methods. Our code
is available at https://github.com/fmp453/erase-eval.
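
The abstract states that the three criteria are integrated into a single metric that comes out low whenever any one criterion is low, but it does not give the formula. The sketch below is only an illustration of aggregation rules with that property (minimum, geometric mean, harmonic mean). It is not the EraseEval metric itself; the function name `combine_scores`, the `mode` parameter, and the assumption that each score lies in [0, 1] are all hypothetical.

```python
from math import prod


def combine_scores(scores, mode="geometric"):
    """Combine per-criterion scores so that one low score pulls the total down.

    Hypothetical helper, not taken from the EraseEval code base.
    Assumes every score lies in [0, 1], higher meaning better.
    """
    scores = list(scores)
    if not scores:
        raise ValueError("scores must be non-empty")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each score must lie in [0, 1]")

    if mode == "min":
        # The weakest criterion alone decides the result.
        return min(scores)
    if mode == "geometric":
        # Geometric mean: any zero score makes the result zero.
        return prod(scores) ** (1.0 / len(scores))
    if mode == "harmonic":
        # Harmonic mean: defined as zero here if any score is zero.
        if any(s == 0.0 for s in scores):
            return 0.0
        return len(scores) / sum(1.0 / s for s in scores)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Illustrative values only: two strong criteria and one weak one.
    example = [0.9, 0.9, 0.1]
    arithmetic = sum(example) / len(example)
    for mode in ("min", "geometric", "harmonic"):
        print(mode, round(combine_scores(example, mode), 3))
    print("arithmetic (for contrast)", round(arithmetic, 3))
```

With all three rules, a single weak criterion lowers the combined score more than a plain arithmetic mean would, which is the behavior the abstract describes. Which rule the authors actually use should be checked against the paper and the linked repository.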