Toward Generalizable Machine Learning Models in Speech, Language, and Hearing Sciences: Estimating Sample Size and Reducing Overfitting.

Journal: Journal of Speech, Language, and Hearing Research (JSLHR)

PMID: 38386017

Abstract

PURPOSE: Many studies using machine learning (ML) in speech, language, and hearing sciences rely on cross-validation with a single data split. This study's first purpose is to provide quantitative evidence that would incentivize researchers to instead use the more robust data-splitting method of nested k-fold cross-validation. The second purpose is to present methods and MATLAB code for performing power analysis for ML-based analysis during the design of a study.
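To make the idea of nested k-fold cross-validation concrete, the following is a minimal Python sketch, not the authors' MATLAB code. It assumes a toy one-dimensional k-nearest-neighbor classifier and synthetic two-class data. All names (`knn_predict`, `split_folds`, `nested_cv`, the candidate list) are hypothetical and chosen only for illustration. The inner loop selects the hyperparameter using outer-training data only, and the outer loop scores that choice on data the selection never saw.

```python
import random


def knn_predict(train, x, n_neighbors):
    """Majority vote among the n_neighbors training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, n_neighbors):
    """Fraction of test rows (x, label) classified correctly."""
    hits = sum(knn_predict(train, x, n_neighbors) == y for x, y in test)
    return hits / len(test)


def split_folds(rows, k, rng):
    """Shuffle a copy of rows and deal it into k nearly equal folds."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def train_test_pairs(folds):
    """Yield (train, test) where each fold serves as the test set once."""
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test


def nested_cv(rows, candidates, outer_k=5, inner_k=3, seed=0):
    """Return one held-out accuracy per outer fold."""
    rng = random.Random(seed)
    outer_scores = []
    for outer_train, outer_test in train_test_pairs(split_folds(rows, outer_k, rng)):
        # Inner loop: pick the hyperparameter using outer-training data only.
        inner_folds = split_folds(outer_train, inner_k, rng)

        def inner_score(n_neighbors):
            pairs = train_test_pairs(inner_folds)
            return sum(accuracy(tr, va, n_neighbors) for tr, va in pairs) / inner_k

        best = max(candidates, key=inner_score)
        # Outer loop: evaluate the selected setting on the untouched outer fold.
        outer_scores.append(accuracy(outer_train, outer_test, best))
    return outer_scores


# Synthetic data (an assumption for illustration): two overlapping Gaussian classes.
data_rng = random.Random(1)
data = [(data_rng.gauss(label, 1.0), label) for label in (0, 1) for _ in range(30)]
scores = nested_cv(data, candidates=[1, 3, 5, 7])
print("mean outer-fold accuracy:", sum(scores) / len(scores))
```

With a single data split, by contrast, the same held-out data would typically be used both to choose the hyperparameter and to report performance, which tends to make the reported estimate optimistic.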

Authors

Hamzeh Ghasemzadeh

Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston.

Robert E Hillman

Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston.

Daryush D Mehta

Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston.

Keywords

Hearing; Humans; Language; Machine Learning; Sample Size; Speech

