Access Denied: Meaningful Data Access for Quantitative Algorithm Audits
Journal: arXiv
Published Date: Feb 1, 2025
Abstract
Independent algorithm audits hold the promise of bringing accountability to
automated decision-making. However, third-party audits are often hindered by
access restrictions, forcing auditors to rely on limited, low-quality data. To
study how these limitations impact research integrity, we conduct audit
simulations on two realistic case studies for recidivism and healthcare
coverage prediction. We examine the accuracy of estimating group parity metrics
across three levels of access: (a) aggregated statistics, (b) individual-level
data with model outputs, and (c) individual-level data without model outputs.
Despite selecting one of the simplest tasks for algorithmic auditing, we find
that data minimization and anonymization practices can strongly increase error
rates on individual-level data, leading to unreliable assessments. We discuss
implications for independent auditors, as well as potential avenues for HCI
researchers and regulators to improve data access and enable both reliable and
holistic evaluations.
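
To make the notion of a group parity metric concrete, the following is a minimal sketch of one common choice, the gap in positive prediction rates between groups (often called demographic or statistical parity difference), computed from individual-level data with model outputs. The abstract does not say which parity metrics the study uses, so the metric, the function names `positive_rates` and `parity_gap`, and the toy records are all illustrative assumptions, not the authors' method or data.

```python
from collections import defaultdict


def positive_rates(records):
    """Return the share of positive model outputs for each group.

    `records` is an iterable of (group, prediction) pairs,
    where prediction is 0 or 1. (Hypothetical input format.)
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for group, prediction in records:
        counts[group] += 1
        positives[group] += prediction
    return {group: positives[group] / counts[group] for group in counts}


def parity_gap(rates):
    """Largest difference in positive rates between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy individual-level data (made up for illustration only).
records = [
    ("a", 1), ("a", 0), ("a", 1), ("a", 1),
    ("b", 0), ("b", 1), ("b", 0), ("b", 0),
]

rates = positive_rates(records)
gap = parity_gap(rates)
print(rates, gap)
```

With access level (a), an auditor would receive only figures like `rates` and could not recompute them; with level (c), the `prediction` column would be missing and would have to be approximated before any such gap could be estimated.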