A machine learning-based approach for estimating and testing associations with multivariate outcomes.

Journal: The International Journal of Biostatistics
Published Date:

Abstract

We propose a method for summarizing the strength of association between a set of variables and a multivariate outcome. Classical summary measures are appropriate when linear relationships exist between covariates and outcomes, while our approach provides an alternative that is useful in situations where complex relationships may be present. We utilize machine learning to detect nonlinear relationships and covariate interactions and propose a measure of association that captures these relationships. A hypothesis test about the proposed associative measure can be used to test the strong null hypothesis of no association between a set of variables and a multivariate outcome. Simulations demonstrate that this hypothesis test has greater power than existing methods against alternatives where covariates have nonlinear relationships with outcomes. We additionally propose measures of variable importance for groups of variables, which summarize each group's association with the outcome. We demonstrate our methodology using data from a birth cohort study on childhood health and nutrition in the Philippines.

Authors

  • David Benkeser
    Group in Biostatistics, University of California, Berkeley, 101 Haviland Hall, Berkeley, CA, U.S.A.
  • Andrew Mertens
    Department of Epidemiology, University of California, Berkeley, Berkeley, USA.
  • John M Colford
    Department of Epidemiology, University of California, Berkeley, Berkeley, USA.
  • Alan Hubbard
    Division of Biostatistics, University of California, Berkeley, California, United States of America.
  • Benjamin F Arnold
    Francis I. Proctor Foundation, University of California, San Francisco, USA.
  • Aryeh Stein
    Hubert Department of Global Health, Emory University Rollins School of Public Health, Atlanta, USA.
  • Mark J van der Laan
    Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA.