Artificial intelligence tools trained on human-labeled data reflect human biases: a case study in a large clinical consecutive knee osteoarthritis cohort.

Journal: Scientific Reports
Published Date:

Abstract

Humans have been shown to have biases when reading medical images, raising questions about whether humans are uniform in their disease gradings. Artificial intelligence (AI) tools trained on human-labeled data may inherit this human non-uniformity. In this study, we used a radiographic knee osteoarthritis external validation dataset of 50 patients and a six-year retrospective consecutive clinical cohort of 8,273 patients. We tested an FDA-approved and CE-marked AI tool for potential non-uniformity in Kellgren-Lawrence grades between the right and left sides of the images by flipping the images horizontally, so that a left knee looked like a right knee and vice versa. According to human review, the AI tool showed non-uniformity, with 20-22% disagreements on the external validation dataset and 13.6% on the cohort. However, we found no evidence of a significant difference in accuracy compared to senior radiologists on the external validation dataset, and no evidence of age bias or sex bias in the cohort. AI non-uniformity can boost the evaluated performance against humans, but image areas with inferior performance should be investigated.
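
The flip test described in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the function name `flip_disagreement_rate`, the `grade` callable (a stand-in for the AI tool, which is not public), and the assumption that each image is a 2D array yielding a single Kellgren-Lawrence grade are all hypothetical choices made for illustration.

```python
import numpy as np


def flip_disagreement_rate(images, grade):
    """Return the fraction of images whose grade changes after mirroring.

    `grade` is a hypothetical callable that maps a 2D image array to one
    Kellgren-Lawrence grade (0-4). It stands in for the AI tool.
    """
    if len(images) == 0:
        raise ValueError("at least one image is required")
    changed = 0
    for image in images:
        original = grade(image)
        # Mirror left-right so a left knee looks like a right knee and vice versa.
        mirrored = grade(np.fliplr(image))
        changed += int(original != mirrored)
    return changed / len(images)
```

A grader that is uniform between sides would give a rate of 0 under this sketch; any grade that changes after mirroring counts as one disagreement.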

Authors

  • Anders Lenskjold
    Department of Radiology and The Parker Institute, Bispebjerg and Frederiksberg Hospital, Bispebjerg Bakke 23, 2400 Copenhagen, Denmark; Radiologic AI Testcenter, Copenhagen, Denmark.
  • Mathias W Brejnebøl
    Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Radiology, Bispebjerg and Frederiksberg Hospital, Frederiksberg, Denmark.
  • Martin H Rose
    Center for Surgical Science, Zealand University Hospital, Køge, Denmark.
  • Henrik Gudbergsen
    The Parker Institute, University of Copenhagen, Copenhagen, Denmark.
  • Akshay Chaudhari
    Stanford University, Stanford, CA, USA.
  • Anders Troelsen
    Department of Orthopedics, Copenhagen University Hospital, Hvidovre, Denmark.
  • Anne Moller
    Department of Public Health, Center for General Practice, University of Copenhagen, Copenhagen, Denmark.
  • Janus U Nybing
    Department of Radiology and The Parker Institute, Bispebjerg and Frederiksberg Hospital, Bispebjerg Bakke 23, 2400 Copenhagen, Denmark; Radiologic AI Testcenter, Copenhagen, Denmark.
  • Mikael Boesen
    Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark; Radiological Artificial Intelligence Testcenter, RAIT.dk, Capital Region of Denmark; Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark.