Biases in machine-learning models of human single-cell data.

Journal: Nature Cell Biology
PMID:

Abstract

Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss the biases that emerge along the pipeline of ML-based single-cell analysis: societal biases affecting whose samples are collected; clinical and cohort biases that influence the generalizability of single-cell datasets; biases stemming from single-cell sequencing; ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples; and biases arising during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.

Authors

  • Theresa Willem
    Institute of History and Ethics in Medicine, Department of Preclinical Medicine, TUM School of Medicine and Health, Technical University of Munich, Ismaninger Straße 22, 81675, Munich, Germany. theresa.willem@tum.de.
  • Vladimir A Shitov
    Department of Computational Health, Institute of Computational Biology, Helmholtz Munich, Munich, Germany.
  • Malte D Luecken
    Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany.
  • Niki Kilbertus
    Helmholtz Munich, Munich, Germany.
  • Stefan Bauer
    Department of Computer Science, ETH Zurich, Zürich, Switzerland.
  • Marie Piraud
    Department of Informatics, Technische Universität München, Munich, Germany.
  • Alena Buyx
    Institute for History and Ethics of Medicine, Technical University of Munich School of Medicine, Technical University of Munich, Munich, Germany.
  • Fabian J Theis
    Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich, Germany.