The Data Artifacts Glossary: a community-based repository for bias on health datasets.

Journal: Journal of biomedical science
PMID:

Abstract

BACKGROUND: The deployment of Artificial Intelligence (AI) in healthcare has the potential to transform patient care through improved diagnostics, personalized treatment plans, and more efficient resource management. However, the effectiveness and fairness of AI are critically dependent on the data it learns from. Biased datasets can lead to AI outputs that perpetuate disparities, particularly affecting social minorities and marginalized groups.

Authors

  • Rodrigo R Gameiro
    Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA.
  • Naira Link Woite
    Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA.
  • Christopher M Sauer
    Department of Intensive Care Medicine, Laboratory for Critical Care Computational Intelligence (LCCI), Amsterdam Medical Data Science (AMDS), Amsterdam Cardiovascular Science (ACS), Amsterdam Institute for Infection and Immunity (AII), Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands.
  • Sicheng Hao
    Duke University School of Medicine, Durham, NC, USA.
  • Chrystinne Oliveira Fernandes
    Department of Informatics, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil.
  • Anna E Premo
    Learning Research and Development Center, University of Pittsburgh, Pittsburgh, PA, USA.
  • Alice Rangel Teixeira
    Department of Philosophy, Universitat Autònoma de Barcelona, Barcelona, Spain.
  • Isabelle Resli
    School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA.
  • An-Kwok Ian Wong
    Division of Pulmonary, Allergy, and Critical Care Medicine, Duke University, Durham, NC, USA.
  • Leo Anthony Celi
    Massachusetts Institute of Technology, Cambridge, MA, USA.