Targeted Unlearning Using Perturbed Sign Gradient Methods With Applications On Medical Images

Journal: arXiv
Published Date:

Abstract

Machine unlearning aims to remove the influence of specific training samples from a trained model without full retraining. While prior work has largely focused on privacy-motivated settings, we recast unlearning as a general-purpose tool for post-deployment model revision. Specifically, we focus on unlearning in clinical contexts, where data shifts, device deprecation, and policy changes are common. To this end, we propose a bilevel optimization formulation of boundary-based unlearning that can be solved using iterative algorithms, and we provide convergence guarantees when first-order algorithms are used to unlearn. Our method introduces a tunable loss design for controlling the forgetting-retention tradeoff and supports novel model composition strategies that merge the strengths of distinct unlearning runs. Across benchmark and real-world clinical imaging datasets, our approach outperforms baselines on both forgetting and retention metrics, including in scenarios involving imaging devices and anatomical outliers. This work establishes machine unlearning as a modular, practical alternative to retraining for real-world model maintenance in clinical applications.
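
The abstract does not spell out the update rule, so the following is only an illustrative sketch of what a perturbed sign gradient step might look like, not the authors' algorithm. It assumes a toy objective that weights a quadratic "retain" loss against a negated quadratic "forget" loss, and it assumes the perturbation is Gaussian noise added to the gradient before the sign is taken. All names here (perturbed_sign_step, toy_grad, run_toy, lam, noise_scale) are hypothetical.

```python
import numpy as np


def perturbed_sign_step(w, grad, lr=0.01, noise_scale=1e-3, rng=None):
    """One sign-gradient step; Gaussian noise is added to the gradient
    before taking the sign (assumed form of the perturbation)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy_grad = grad + noise_scale * rng.standard_normal(grad.shape)
    return w - lr * np.sign(noisy_grad)


def toy_grad(w, retain_target, forget_target, lam=0.25):
    """Gradient of the toy objective
        (1 - lam) * 0.5 * ||w - retain_target||^2
          - lam   * 0.5 * ||w - forget_target||^2.
    lam trades off forgetting against retention; lam < 0.5 keeps the
    toy objective convex. This is a stand-in, not the paper's loss."""
    return (1.0 - lam) * (w - retain_target) - lam * (w - forget_target)


def run_toy(steps=500, lam=0.25, seed=0):
    """Run the sketch from w = 0 and return the final parameters."""
    rng = np.random.default_rng(seed)
    retain_target = np.ones(2)
    forget_target = np.zeros(2)
    w = np.zeros(2)
    for _ in range(steps):
        g = toy_grad(w, retain_target, forget_target, lam)
        w = perturbed_sign_step(w, g, rng=rng)
    return w


if __name__ == "__main__":
    print(run_toy())
```

In this toy setting the minimizer is ((1 - lam) * retain_target - lam * forget_target) / (1 - 2 * lam), so raising lam moves the solution further from the forget target, which is one simple way to picture a tunable forgetting-retention tradeoff.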

Authors

  • George R. Nahass
  • Zhu Wang
  • Homa Rashidisabet
  • Won Hwa Kim
  • Sasha Hubschman
  • Jeffrey C. Peterson
  • Ghasem Yazdanpanah
  • Chad A. Purnell
  • Pete Setabutr
  • Ann Q. Tran
  • Darvin Yi
  • Sathya N. Ravi