Detecting and Monitoring Bias for Subgroups in Breast Cancer Detection AI
Journal: arXiv
Published Date: Feb 14, 2025
Abstract
Automated mammography screening plays an important role in early breast
cancer detection. However, machine learning models developed on specific
training datasets may exhibit performance degradation and bias when deployed
in real-world settings. In this paper, we analyze the performance of
high-performing AI models on two mammography datasets: the Emory Breast Imaging
Dataset (EMBED) and the RSNA 2022 challenge dataset. Specifically, we evaluate
how these models perform across different subgroups, defined by six attributes,
to detect potential biases using a range of classification metrics. Our
analysis identifies certain subgroups that demonstrate notable
underperformance, highlighting the need for ongoing monitoring of these
subgroups' performance. To address this, we adopt a monitoring method designed
to detect performance drifts over time. Upon identifying a drift, this method
issues an alert, which can enable timely interventions. This approach not only
provides a tool for tracking performance but also helps ensure that AI
models continue to perform effectively across diverse populations.
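
The two steps described above, per-subgroup evaluation and drift alerting, can be illustrated with a minimal Python sketch. The abstract does not name the specific metrics or the monitoring method, so everything here is an assumption for illustration only: the helper names `subgroup_metrics` and `drift_alerts` are hypothetical, sensitivity and specificity stand in for the "range of classification metrics", and a rolling-mean threshold stands in for the drift detector. It should not be read as the authors' method.

```python
from collections import defaultdict


def subgroup_metrics(records):
    """Compute per-subgroup sensitivity and specificity.

    records: iterable of (subgroup, y_true, y_pred) with binary labels,
    where 1 means cancer and 0 means no cancer (assumed encoding).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, y_true, y_pred in records:
        if y_true == 1:
            key = "tp" if y_pred == 1 else "fn"
        else:
            key = "fp" if y_pred == 1 else "tn"
        counts[group][key] += 1

    results = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        results[group] = {
            "n": positives + negatives,
            # None marks a metric that is undefined for this subgroup.
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return results


def drift_alerts(scores, baseline, tolerance=0.05, window=3):
    """Return indices of periods whose rolling mean drops below
    baseline - tolerance (a simple stand-in drift rule, not the paper's)."""
    alerts = []
    for i in range(window - 1, len(scores)):
        rolling_mean = sum(scores[i - window + 1 : i + 1]) / window
        if rolling_mean < baseline - tolerance:
            alerts.append(i)
    return alerts


if __name__ == "__main__":
    # Toy records: (subgroup, true label, predicted label).
    toy = [("A", 1, 1), ("A", 1, 0), ("A", 0, 0), ("B", 1, 1), ("B", 0, 1)]
    for group, metrics in subgroup_metrics(toy).items():
        print(group, metrics)

    # Hypothetical per-period sensitivity for one subgroup.
    history = [0.90, 0.91, 0.89, 0.80, 0.78, 0.79]
    print(drift_alerts(history, baseline=0.90))
```

In practice each subgroup would be tracked with its own baseline, and the tolerance and window would be chosen to balance false alarms against detection delay. Small subgroups yield noisy estimates, so reporting `n` next to each metric helps when interpreting apparent gaps.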