Toward Generalizable Machine Learning Models in Speech, Language, and Hearing Sciences: Estimating Sample Size and Reducing Overfitting.
Journal:
Journal of speech, language, and hearing research : JSLHR
PMID:
38386017
Abstract
PURPOSE: Many studies using machine learning (ML) in speech, language, and hearing sciences rely upon cross-validations with single data splitting. This study's first purpose is to provide quantitative evidence that would incentivize researchers to instead use the more robust data splitting method of nested -fold cross-validation. The second purpose is to present methods and MATLAB code to perform power analysis for ML-based analysis during the design of a study.