The Harms of Class Imbalance Corrections for Machine Learning Based Prediction Models: A Simulation Study.

Journal: Statistics in Medicine

PMID: 39865585

Abstract

INTRODUCTION: Risk prediction models are increasingly used in healthcare to aid clinical decision-making. In most clinical contexts, model calibration (i.e., the reliability of the risk estimates) is critical. Data available for model development are often imbalanced with respect to the modeled outcome (i.e., individuals with vs. without the event of interest are not equally prevalent in the data). Researchers commonly correct for class imbalance, yet the effect of such imbalance corrections on the calibration of machine learning models is largely unknown.
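To make the link between imbalance correction and calibration concrete, the following is a minimal sketch and not the authors' simulation design. It assumes one particular correction (random undersampling of non-events to a 1:1 ratio), a single standard-normal predictor, and a hand-written one-predictor logistic regression. All names (`fit_logistic`, `mean_risk`, the true coefficients -2.5 and 1.0) are illustrative assumptions. It compares the mean predicted risk with the observed event rate, a simple check sometimes called calibration-in-the-large.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iters=15):
    """Fit intercept a and slope b of a one-predictor logistic model
    by Newton-Raphson (maximum likelihood)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. slope
            h00 += w             # entries of X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def mean_risk(a, b, xs):
    """Average predicted risk over a set of individuals."""
    return sum(sigmoid(a + b * x) for x in xs) / len(xs)


# Simulate an imbalanced outcome (assumed data-generating model).
n = 5000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [1 if rng.random() < sigmoid(-2.5 + 1.0 * x) else 0 for x in xs]
prevalence = sum(ys) / n

# Model 1: fitted on the data as observed.
a_full, b_full = fit_logistic(xs, ys)

# Model 2: fitted after random undersampling of non-events to a 1:1 ratio.
events = [i for i in range(n) if ys[i] == 1]
non_events = [i for i in range(n) if ys[i] == 0]
kept = events + rng.sample(non_events, len(events))
a_under, b_under = fit_logistic([xs[i] for i in kept], [ys[i] for i in kept])

# Compare mean predicted risk in the original population with the event rate.
mean_full = mean_risk(a_full, b_full, xs)
mean_under = mean_risk(a_under, b_under, xs)
print(f"observed event rate:             {prevalence:.3f}")
print(f"mean risk, no correction:        {mean_full:.3f}")
print(f"mean risk, after undersampling:  {mean_under:.3f}")
```

Under these assumptions, the uncorrected model's mean predicted risk matches the observed event rate, while the model trained on the undersampled data predicts noticeably higher risks on average, because its intercept reflects the artificial 1:1 class ratio rather than the true prevalence. Whether and how strongly this carries over to other corrections and to machine learning algorithms is the question the study addresses.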

Authors

Alex Carriero
Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.

Kim Luijken
Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.

Anne de Hond
Department of Digital Health, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, CG Utrecht, the Netherlands.

Karel G M Moons
Julius Center for Health Sciences and Primary Care, and Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.

Ben Van Calster
Maarten van Smeden

Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands.

Keywords

Algorithms; Computer Simulation; Humans; Machine Learning; Models, Statistical; Monte Carlo Method; Reproducibility of Results; Risk Assessment

