Testing a global null hypothesis using ensemble machine learning methods.

Journal: Statistics in Medicine
Published Date:

Abstract

Testing the global null hypothesis that none of a large set of biomarker measurements predicts a binary outcome of interest is an important task in biomedical studies. We seek to improve the power of such tests by leveraging ensemble machine learning methods. Ensemble methods such as random forest, bagging, and adaptive boosting model the relationship between the outcome and the predictors nonparametrically, while stacking combines the strengths of multiple learners. We demonstrate the power of the proposed testing methods through Monte Carlo studies and illustrate their use by applying them to the immunologic biomarker dataset from the RV144 HIV vaccine efficacy trial.
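
The abstract does not state the test statistic or how its null distribution is obtained, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes a permutation test whose statistic is the out-of-bag AUC of a small bagged ensemble of decision stumps, used here as a dependency-free stand-in for a random forest. All function names, the simulated data, and the tuning values (n_stumps, mtry, n_perm) are hypothetical.

```python
import random

def auc(scores, labels):
    # Mann-Whitney form of the AUC; tied scores count one half.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def fit_stump(X, y, rows, features):
    # Among the candidate features, split at the bootstrap median and keep
    # the split whose two sides differ most in outcome rate.
    best = None
    for j in features:
        vals = sorted(X[i][j] for i in rows)
        cut = vals[len(vals) // 2]
        left = [y[i] for i in rows if X[i][j] < cut]
        right = [y[i] for i in rows if X[i][j] >= cut]
        if not left or not right:
            continue
        rate_l, rate_r = sum(left) / len(left), sum(right) / len(right)
        gap = abs(rate_l - rate_r)
        if best is None or gap > best[0]:
            best = (gap, j, cut, rate_l, rate_r)
    return best

def oob_auc(X, y, rng, n_stumps=100, mtry=2):
    # Bagging: each stump is fit on a bootstrap sample and scores only the
    # observations left out of that sample (out-of-bag predictions).
    n, p = len(X), len(X[0])
    total, count = [0.0] * n, [0] * n
    for _ in range(n_stumps):
        rows = [rng.randrange(n) for _ in range(n)]
        stump = fit_stump(X, y, rows, rng.sample(range(p), mtry))
        if stump is None:
            continue
        _, j, cut, rate_l, rate_r = stump
        for i in set(range(n)) - set(rows):
            total[i] += rate_l if X[i][j] < cut else rate_r
            count[i] += 1
    base = sum(y) / n  # fallback score for rows that were never out of bag
    scores = [t / c if c else base for t, c in zip(total, count)]
    return auc(scores, y)

def permutation_test(X, y, n_perm=99, seed=0):
    # Shuffling y breaks any link between outcome and predictors, which
    # mimics the global null; the p-value is the share of permuted
    # statistics at least as large as the observed one (with +1 correction).
    rng = random.Random(seed)
    observed = oob_auc(X, y, rng)
    exceed = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)
        if oob_auc(X, y_perm, rng) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)

def simulate(n=60, p=5, seed=1):
    # Hypothetical data: only the first of p predictors drives the outcome.
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [1 if row[0] + 0.3 * rng.gauss(0, 1) > 0 else 0 for row in X]
    return X, y

if __name__ == "__main__":
    X, y = simulate()
    observed, p_value = permutation_test(X, y)
    print("out-of-bag AUC:", round(observed, 3), "permutation p-value:", p_value)
```

With 99 permutations the smallest attainable p-value is 0.01; a real analysis would typically use more permutations and a full-strength learner, and the choice of statistic here is an assumption rather than something taken from the paper.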

Authors

  • Sunwoo Han
    Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, USA.
  • Youyi Fong
    Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, USA. youyifong@gmail.com.
  • Ying Huang
    Department of Otolaryngology, Head and Neck Surgery, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, China.