Causes of Outcome Learning: a causal inference-inspired machine learning approach to disentangling common combinations of potential causes of a health outcome.

Journal: International Journal of Epidemiology
PMID:

Abstract

Nearly all diseases are caused by different combinations of exposures. Yet, most epidemiological studies focus on estimating the effect of a single exposure on a health outcome. We present the Causes of Outcome Learning approach (CoOL), which seeks to discover combinations of exposures that lead to an increased risk of a specific outcome in parts of the population. The approach allows for exposures acting alone and in synergy with others. The road map of CoOL involves (i) a pre-computational phase used to define a causal model; (ii) a computational phase with three steps, namely (a) fitting a non-negative model on an additive scale, (b) decomposing risk contributions and (c) clustering individuals based on the risk contributions into subgroups; and (iii) a post-computational phase of hypothesis development, validation and triangulation using new data before eventually updating the causal model. The computational phase uses a tailored neural network for the non-negative model on an additive scale and layer-wise relevance propagation for the risk decomposition through this model. We demonstrate the approach on simulated and real-life data using the R package 'CoOL'. The presentation focuses on binary exposures and outcomes, but the approach can also be extended to other measurement types. This approach encourages and enables researchers to identify combinations of exposures as potential causes of the health outcome of interest. Expanding our ability to discover complex causes could eventually result in more effective, targeted and informed interventions prioritized for their public health impact.
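
The three computational steps described above can be illustrated with a small toy sketch in Python. This is not the API of the R package 'CoOL' and not the authors' implementation: the function names (`forward`, `decompose`), the three exposures and all weights are hypothetical, the weights are hand-picked rather than fitted, and grouping identical contribution patterns stands in for a real clustering step. It only shows how a non-negative additive model with thresholded hidden units can express one exposure acting alone and two acting in synergy, and how a relevance-propagation-style rule can split each individual's risk back onto the exposures.

```python
import itertools
import numpy as np

def forward(X, W, b_hidden, baseline):
    """Toy non-negative additive risk model: baseline + sum of ReLU hidden units."""
    # A hidden unit contributes only when its weighted exposures exceed its threshold.
    hidden = np.maximum(X @ W - b_hidden, 0.0)
    return hidden, baseline + hidden.sum(axis=1)

def decompose(X, W, hidden):
    """Share each hidden unit's activation among exposures in proportion to x_j * W_jk."""
    z = X[:, :, None] * W[None, :, :]            # shape (n, exposures, hidden)
    total = z.sum(axis=1, keepdims=True)         # shape (n, 1, hidden)
    share = np.divide(z, total, out=np.zeros_like(z), where=total > 0)
    return (share * hidden[:, None, :]).sum(axis=2)  # shape (n, exposures)

# Hypothetical, hand-picked parameters (assumption: not fitted to any data).
# Exposure A acts alone via hidden unit 0; B and C act in synergy via hidden unit 1.
W = np.array([[0.1, 0.0],
              [0.0, 0.2],
              [0.0, 0.2]])
b_hidden = np.array([0.0, 0.2])   # unit 1 needs both B and C to exceed its threshold
baseline = 0.05                   # risk with no exposures present

# All eight combinations of three binary exposures (columns A, B, C).
X = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)

hidden, risk = forward(X, W, b_hidden, baseline)   # step (a), with fixed weights
contrib = decompose(X, W, hidden)                  # step (b): risk contributions

# Step (c), simplified: individuals with identical contribution patterns form a subgroup.
groups = {}
for i, row in enumerate(contrib):
    groups.setdefault(tuple(np.round(row, 6).tolist()), []).append(i)

for pattern, members in groups.items():
    print(pattern, "rows:", members)
```

In this toy setting the contributions plus the baseline add up to each individual's predicted risk, which is the property that makes the decomposition interpretable on an additive scale. With real data the weights would have to be estimated under the non-negativity constraint, and the subgroups would come from a clustering algorithm rather than exact pattern matching.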

Authors

  • Andreas Rieckmann
    Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark.
  • Piotr Dworzynski
    Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark.
  • Leila Arras
    Machine Learning Group, Fraunhofer Heinrich Hertz Institute, Berlin, Germany.
  • Sebastian Lapuschkin
    Department of Video Coding & Analytics, Fraunhofer Heinrich Hertz Institute, Berlin, Germany.
  • Wojciech Samek
    Machine Learning Group, Fraunhofer Heinrich Hertz Institute, Berlin, Germany.
  • Onyebuchi Aniweta Arah
    Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles, CA, USA.
  • Naja Hulvej Rod
    Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark.
  • Claus Thorn Ekstrøm
    Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark.