Machine Learning Methods for Estimating Personalized Treatment Effects: Insights on validity from two large trials.

Journal: American Journal of Epidemiology

Published Date: Mar 20, 2026

Abstract

Machine learning (ML) methods have the potential to improve precision medicine by estimating personalized treatment effects. However, formal validation of these methods remains limited, leaving their reliability in empirical settings largely uncertain. In this study, we evaluated the internal and external validity of 17 ML methods for causal heterogeneity, including metalearners, tree-based methods, and deep learning methods, using data from two large randomized controlled trials: the International Stroke Trial (n = 19 435) and the Chinese Acute Stroke Trial (n = 21 106). We assessed performance using three visual metrics and three quantitative metrics. None of the ML methods consistently demonstrated reliable performance, either internally or externally. Heterogeneous treatment effects estimated from training data failed to generalize to the test data, even in the absence of distribution shifts. These results raise concerns about the current applicability of ML models in precision medicine and highlight the need for more robust validation techniques to ensure generalizability.
