Evaluation of machine-learning models to measure individualized treatment effects from randomized clinical trial data with time-to-event outcomes
Journal: arXiv
Published Date: Jun 13, 2025
Abstract
In randomized clinical trials, regression models can be used to explore the
relationships between patients' variables (e.g., clinical, pathological or
lifestyle variables, and also biomarker or genomics data) and the magnitude of
treatment effect. Our aim is to evaluate the value of flexible machine learning
models that can incorporate interactions and nonlinear effects of
high-dimensional data to derive individualized treatment recommendations in
the setting of such trials with time-to-event outcomes. We compare survival
models based on neural networks (CoxCC and CoxTime) and random survival forests
(Interaction Forests). A Cox model with an adaptive LASSO penalty is used as a
benchmark. Performance is assessed with metrics specific to individualized
treatment recommendations: the C-for-Benefit, the E50-for-Benefit, and the RMSE
for treatment benefit. We conduct an extensive simulation study using two
data generation processes that incorporate nonlinearity and interactions up to
the third order. The models are applied to gene expression and clinical data from two
breast cancer studies. The machine learning-based methods show reasonable
performance on the simulated data sets, with Interaction Forests standing out
for discrimination and the neural networks for calibration. These methods can
be used to evaluate individualized treatment effects from randomized trials
when nonlinear and interaction effects are expected to be present.
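
To make the RMSE for treatment benefit concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a patient's benefit is the difference in predicted survival probability at a fixed time horizon under treatment versus control, and that the true benefit is known, as it would be in a simulation. That definition, the function names, and the toy values are assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np


def predicted_benefit(surv_treated, surv_control):
    """Per-patient benefit: predicted survival probability at a fixed horizon
    under treatment minus the same quantity under control.
    (Assumed definition, for illustration only.)"""
    treated = np.asarray(surv_treated, dtype=float)
    control = np.asarray(surv_control, dtype=float)
    if treated.shape != control.shape:
        raise ValueError("treated and control predictions must have the same shape")
    return treated - control


def rmse_benefit(pred_benefit, true_benefit):
    """Root mean squared error between predicted and true per-patient benefit."""
    pred = np.asarray(pred_benefit, dtype=float)
    true = np.asarray(true_benefit, dtype=float)
    if pred.shape != true.shape:
        raise ValueError("predicted and true benefit must have the same shape")
    return float(np.sqrt(np.mean((pred - true) ** 2)))


# Hypothetical toy values for three simulated patients.
surv_treated = [0.80, 0.60, 0.90]  # predicted survival if treated
surv_control = [0.70, 0.65, 0.70]  # predicted survival if not treated
true_benefit = [0.10, 0.00, 0.10]  # known only because the data are simulated

pred = predicted_benefit(surv_treated, surv_control)
score = rmse_benefit(pred, true_benefit)
print(f"RMSE for treatment benefit: {score:.4f}")
```

With real trial data the true per-patient benefit is not observed, so a metric of this kind is only computable in simulation settings, where the data generation process is known.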