Critical appraisal of fairness metrics for artificial intelligence-based clinical prediction models: a scoping review.
Journal: The Lancet. Digital health
Published Date: May 28, 2026
Abstract
Predictive artificial intelligence (AI) offers an opportunity to improve clinical practice and patient outcomes but risks perpetuating biases if fairness is inadequately addressed. However, the definition of fairness remains unclear. We conducted a scoping review to identify and critically appraise fairness metrics in clinical predictive AI models. We defined a fairness metric as a metric quantifying whether a model discriminates (societally) against individuals or groups defined by sensitive attributes. We searched five databases for English-language literature published between 2014 and 2024, screened 820 records, included 42 studies, and extracted 63 fairness metrics. These metrics, which we classified by performance dependency, model output level, and base performance metric, revealed a fragmented landscape in clinical predictive AI, with inadequate clinical validation and over-reliance on threshold-dependent metrics. 19 metrics were explicitly developed for health care, of which only one was for clinical use. Our findings highlight conceptual challenges in defining and quantifying fairness and identify gaps in uncertainty quantification, intersectionality, and real-world applicability. Future work on clinical predictive AI models should therefore prioritise clinically meaningful metrics.