Regression trees and ensembles for cumulative incidence functions.

Journal: The International Journal of Biostatistics

Abstract

The use of cumulative incidence functions for characterizing the risk of one type of event in the presence of others has become increasingly popular over the past two decades. The problems of modeling, estimation and inference have been treated using parametric, nonparametric and semi-parametric methods. Efforts to develop suitable extensions of machine learning methods, such as regression trees and ensemble methods, have begun comparatively recently. In this paper, we propose a novel approach to estimating cumulative incidence curves in a competing risks setting using regression trees and associated ensemble estimators. The proposed methods use augmented estimators of the Brier score risk as the primary basis for building and pruning trees, and are easily implemented using existing R packages. Data from the Radiation Therapy Oncology Group (trial 9410) are used to illustrate these new methods.
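
For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the standard nonparametric (Aalen-Johansen type) estimator of a cumulative incidence function under right censoring. It is not the tree-based or ensemble method proposed in the paper, and it does not compute the augmented Brier score risk. The function name `cumulative_incidence`, the coding of events (0 for censored, positive integers for causes) and the toy data are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def cumulative_incidence(
    times: Sequence[float], events: Sequence[int], cause: int
) -> List[Tuple[float, float]]:
    """Nonparametric cumulative incidence estimate for one cause.

    times  : observed follow-up times
    events : 0 = censored, k > 0 = event of cause k (illustrative coding)
    cause  : the cause of interest
    Returns (t, CIF(t)) at each distinct time where any event occurs.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    # Distinct times at which at least one event (of any cause) occurs.
    event_times = sorted({t for t, e in zip(times, events) if e != 0})

    surv = 1.0  # all-cause Kaplan-Meier survival just before the current time
    cif = 0.0   # running cumulative incidence for the cause of interest
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_all = sum(1 for u, e in zip(times, events) if u == t and e != 0)
        d_cause = sum(1 for u, e in zip(times, events) if u == t and e == cause)
        # Increment: P(event-free before t) * cause-specific hazard at t.
        cif += surv * d_cause / at_risk
        # Update all-cause survival after using its left limit above.
        surv *= 1.0 - d_all / at_risk
        curve.append((t, cif))
    return curve


# Toy data (hypothetical): five subjects, two competing causes.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]
cif_cause1 = cumulative_incidence(toy_times, toy_events, cause=1)
cif_cause2 = cumulative_incidence(toy_times, toy_events, cause=2)
```

In a regression tree setting, an estimate of this kind would typically be computed within each terminal node; how the paper's augmented Brier score risk drives splitting and pruning is not reproduced here.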

Authors

  • Youngjoo Cho
    Department of Applied Statistics, Konkuk University, Seoul, Republic of Korea.
  • Annette M Molinaro
    Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA.
  • Chen Hu
    Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
  • Robert L Strawderman
    Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY, 14642, USA.