Meaningful Data Erasure in the Presence of Dependencies
Journal: arXiv
Published Date: Jul 1, 2025
Abstract
Data regulations like GDPR require systems to support data erasure but leave
the definition of "erasure" open to interpretation. This ambiguity makes
compliance challenging, especially in databases where data dependencies can
lead to erased data being inferred from remaining data. We formally define a
notion of data erasure guaranteeing that anything inferable about deleted data
through dependencies is limited to what could have been inferred before the
data was inserted. We design erasure mechanisms that enforce this guarantee at
minimal cost. Additionally, we explore strategies to balance cost and
throughput, batch multiple erasures, and proactively compute data retention
times when possible. We demonstrate the practicality and scalability of our
algorithms using both real and synthetic datasets.
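
To make the inference problem concrete, the following is a minimal sketch, not the paper's mechanism. It assumes a toy table with a single functional dependency zip -> city; the table contents, the helper name `infer_erased`, and the fix of additionally nulling the determinant cell are all illustrative assumptions. It shows that nulling one cell does not erase it while another row still determines its value.

```python
from typing import Optional

# Hypothetical toy table; None marks an erased cell.
Row = dict[str, Optional[str]]


def infer_erased(rows: list[Row], idx: int, lhs: str, rhs: str) -> Optional[str]:
    """Return the value of rows[idx][rhs] implied by the dependency lhs -> rhs.

    Looks for another row that shares the same lhs value and still carries
    an rhs value. Returns None when nothing can be inferred.
    """
    key = rows[idx][lhs]
    if key is None:
        return None
    for j, other in enumerate(rows):
        if j != idx and other[lhs] == key and other[rhs] is not None:
            return other[rhs]
    return None


rows: list[Row] = [
    {"name": "Ana", "zip": "92617", "city": "Irvine"},
    {"name": "Bo", "zip": "92617", "city": "Irvine"},
]

# Naive erasure: null only the requested cell.
rows[0]["city"] = None
leaked = infer_erased(rows, 0, "zip", "city")  # still recoverable via Bo's row

# Illustrative fix (an assumption, not the paper's algorithm):
# also erase the determinant cell so the dependency no longer applies.
rows[0]["zip"] = None
after = infer_erased(rows, 0, "zip", "city")

print(leaked, after)
```

In this sketch the erased city is recoverable until the zip cell in the same row is also removed. Choosing which additional cells to remove at minimal cost is the kind of question the abstract's erasure mechanisms address.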