Pitfalls and Best Practices in Evaluation of AI Algorithmic Biases in Radiology.

Journal: Radiology
Published Date:

Abstract

Despite growing awareness of problems with fairness in artificial intelligence (AI) models in radiology, evaluation of algorithmic biases, or AI biases, remains challenging due to various complexities. These include incomplete reporting of demographic information in medical imaging datasets, variability in definitions of demographic categories, and inconsistent statistical definitions of bias. To guide the appropriate evaluation of AI biases in radiology, this article summarizes the pitfalls in the evaluation and measurement of algorithmic biases. These pitfalls span the spectrum from the technical (eg, how different statistical definitions of bias impact conclusions about whether an AI model is biased) to those associated with social context (eg, how different conventions of race and ethnicity impact identification or masking of biases). Actionable best practices and future directions to avoid these pitfalls are summarized across three key areas: medical imaging datasets, demographic definitions, and statistical evaluations of bias. Although AI bias in radiology has been broadly reviewed in the recent literature, this article focuses specifically on underrecognized potential pitfalls related to the three key areas. Awareness of these pitfalls, along with actionable practices to avoid them, can help ensure that exciting AI technologies are used in radiology for the good of all people.

Authors

  • Paul H Yi
    The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, Maryland. Electronic address: Pyi10@jhmi.edu.
  • Preetham Bachina
    University of Maryland Medical Intelligent Imaging (UM2ii) Center, University of Maryland School of Medicine, 670 W Baltimore St, First Floor, Room 1172, Baltimore, MD 21201; Johns Hopkins University School of Medicine, Baltimore, Md.
  • Beepul Bharti
    Johns Hopkins University.
  • Sean P Garin
    University of Maryland Medical Intelligent Imaging (UM2ii) Center, University of Maryland School of Medicine, 670 W Baltimore St, First Floor, Room 1172, Baltimore, MD 21201; Uniformed Services University of the Health Sciences, Bethesda, Md.
  • Adway Kanhere
    Department of Otorhinolaryngology-Head and Neck Surgery, University of Maryland School of Medicine, Baltimore, MD, USA; University of Maryland Institute for Health Computing, Bethesda, MD, USA.
  • Pranav Kulkarni
    Bioinformatics Facility, CECAD Research Center, University of Cologne, Cologne, Germany.
  • David Li
    Department of Preclinical Research, Angion Biomedica Corporation, Nassau, NY 11553, USA. david_li@college.harvard.edu.
  • Vishwa S Parekh
    The Russell H. Morgan Department of Radiology and Radiological Sciences, The Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA.
  • Samantha M Santomartino
    University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, First Floor, Room 1172, Baltimore, MD 21201.
  • Linda Moy
    Department of Radiology, New York University School of Medicine, 160 E 34th St, New York, NY 10016.
  • Jeremias Sulam
    Johns Hopkins University.