Whence Is A Model Fair? Fixing Fairness Bugs via Propensity Score Matching
Journal: arXiv
Published Date: Apr 23, 2025
Abstract
Fairness-aware learning aims to mitigate discrimination against specific
protected social groups (e.g., those categorized by gender, ethnicity, age)
while minimizing predictive performance loss. Despite efforts to improve
fairness in machine learning, prior studies have shown that many models remain
unfair when measured against various fairness metrics. In this paper, we
examine whether the way training and test data are sampled affects the
reliability of reported fairness metrics. Since training and test sets are
often randomly sampled from the same population, bias present in the training
data may still exist in the test data, potentially skewing fairness
assessments. To address this, we propose FairMatch, a post-processing method
that applies propensity score matching to evaluate and mitigate bias. FairMatch
identifies control and treatment pairs with similar propensity scores in the
test set and adjusts decision thresholds for different subgroups accordingly.
For samples that cannot be matched, we perform probabilistic calibration using
fairness-aware loss functions. Experimental results demonstrate that our
approach can (a) precisely locate subsets of the test data where the model is
unbiased, and (b) significantly reduce bias on the remaining data. Overall,
propensity score matching offers a principled way to improve both fairness
evaluation and mitigation, without sacrificing predictive performance.
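
The matching step described above can be illustrated with a minimal sketch. This is not the authors' FairMatch implementation: the abstract does not specify the matching algorithm, the caliper, or how propensity scores are estimated, so the greedy 1:1 nearest-neighbour strategy, the `caliper` value, the assumption that protected-group membership plays the role of the treatment, and the helper names `match_pairs` and `pair_disagreement_rate` are all hypothetical choices made for illustration. The scores are assumed to be precomputed.

```python
from typing import List, Tuple


def match_pairs(
    scores: List[float],
    protected: List[int],
    caliper: float = 0.05,  # assumed tolerance, not taken from the paper
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    Returns (pairs, unmatched): each pair is (treated_idx, control_idx),
    and unmatched lists the indices left without a partner.
    """
    treated = [i for i, g in enumerate(protected) if g == 1]
    controls = {i for i, g in enumerate(protected) if g == 0}
    pairs: List[Tuple[int, int]] = []
    # Visit treated samples in order of increasing score.
    for t in sorted(treated, key=lambda i: scores[i]):
        if not controls:
            break
        # Closest remaining control; ties broken by lower index.
        c = min(controls, key=lambda j: (abs(scores[j] - scores[t]), j))
        if abs(scores[c] - scores[t]) <= caliper:
            pairs.append((t, c))
            controls.remove(c)  # each control is used at most once
    matched = {i for pair in pairs for i in pair}
    unmatched = [i for i in range(len(scores)) if i not in matched]
    return pairs, unmatched


def pair_disagreement_rate(
    pairs: List[Tuple[int, int]], predictions: List[int]
) -> float:
    """Fraction of matched pairs whose two members get different predictions."""
    if not pairs:
        return 0.0
    differing = sum(predictions[t] != predictions[c] for t, c in pairs)
    return differing / len(pairs)


# Toy test set: precomputed propensity scores, group labels, model outputs.
scores = [0.30, 0.32, 0.70, 0.71, 0.95, 0.10]
protected = [1, 0, 1, 0, 1, 0]
predictions = [1, 1, 0, 1, 1, 0]

pairs, unmatched = match_pairs(scores, protected)
rate = pair_disagreement_rate(pairs, predictions)
```

In this toy data, samples 4 and 5 have no partner within the caliper and are left unmatched; in the method described above, such samples would be handled by the separate calibration step rather than by threshold adjustment. The disagreement rate is only one possible diagnostic on matched pairs and stands in for whichever fairness metric is actually used.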