Ensemble machine learning model trained on a new synthesized dataset generalizes well for stress prediction using wearable devices.
Journal:
Journal of biomedical informatics
Published Date:
Dec 2, 2023
Abstract
INTRODUCTION: Advances in wearable sensor technology have enabled the collection of biomarkers that may correlate with levels of elevated stress. While significant research has been done in this domain, specifically in using machine learning to detect elevated levels of stress, the challenge of producing a machine learning model capable of generalizing well for use on new, unseen data remain. Acute stress response has both subjective, psychological and objectively measurable, biological components that can be expressed differently from person to person, further complicating the development of a generic stress measurement model. Another challenge is the lack of large, publicly available datasets labeled for stress response that can be used to develop robust machine learning models. In this paper, we first investigate the generalization ability of models built on datasets containing a small number of subjects, recorded in single study protocols. Next, we propose and evaluate methods combining these datasets into a single, large dataset to study the generalization capability of machine learning models built on larger datasets. Finally, we propose and evaluate the use of ensemble techniques by combining gradient boosting with an artificial neural network to measure predictive power on new, unseen data. In favor of reproducible research and to assist the community advance the field, we make all our experimental data and code publicly available through Github at https://github.com/xalentis/Stress. This paper's in-depth study of machine learning model generalization for stress detection provides an important foundation for the further study of stress response measurement using sensor biomarkers, recorded with wearable technologies.