A structural characterization of shortcut features for prediction.

Journal: European journal of epidemiology
PMID:

Abstract

With the rising use of machine learning for healthcare applications, practitioners are increasingly confronted with the limitations of prediction models that are trained in one setting but meant to be deployed in several others. One recently identified limitation is so-called shortcut learning, whereby a model learns to rely on features whose association with the prediction target does not hold across settings. A well-known example is the watermark on chest X-rays, which has been shown to act as a shortcut feature. In this viewpoint, we attempt to give a structural characterization of shortcut features in terms of causal directed acyclic graphs (DAGs). This is the first attempt at defining shortcut features in terms of their causal relationship with a model's prediction target.

Authors

  • David Bellamy
    CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
  • Miguel A Hernán
  • Andrew Beam
    Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States of America; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America.