A comprehensive survey and comparative analysis of time series data augmentation in medical wearable computing.

Journal: PLOS ONE

PMID: 40100883

Abstract

Recent advancements in hardware technology have spurred a surge in the popularity and ubiquity of wearable sensors, opening up new applications within the medical domain. This proliferation has resulted in a notable increase in the availability of Time Series (TS) data characterizing behavioral or physiological information from the patient, prompting initiatives to leverage machine learning and data analysis techniques. Nonetheless, the complexity and time required for collecting data remain significant hurdles, limiting dataset sizes and hindering the effectiveness of machine learning. Data Augmentation (DA) stands out as a prime solution, facilitating the generation of synthetic data to address challenges associated with acquiring medical data. DA has been shown to consistently improve performance on image data. As a result, investigations have been carried out to evaluate DA for TS, in particular for TS classification. However, the current state of DA in TS classification faces challenges, including methodological taxonomies restricted to the univariate case, insufficient guidance for selecting suitable DA methods, and a lack of conclusive evidence regarding the amount of synthetic data required to attain optimal outcomes. This paper conducts a comprehensive survey and experiments on DA techniques for TS and their application to TS classification. We propose an updated taxonomy spanning three families of Time Series Data Augmentation (TSDA): Random Transformation (RT), Pattern Mixing (PM), and Generative Models (GM). Additionally, we empirically evaluate 12 TSDA methods across diverse datasets used in medical-related applications, including OPPORTUNITY and HAR for Human Activity Recognition, DEAP for emotion recognition, and the BioVid Heat Pain Database (BVDB) and PainMonit Database (PMDB) for pain recognition.
Through comprehensive experimental analysis, we identify the most effective DA techniques and provide recommendations for researchers regarding the generation of synthetic data to maximize outcomes from DA methods. Our findings show that, despite their simplicity, DA methods of the RT family are the most consistent in improving performance compared to not using any augmentation.
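
As an illustration of what a Random Transformation (RT) method can look like, the following is a minimal sketch and not the authors' implementation. The abstract does not name the 12 evaluated methods, so jittering (additive Gaussian noise) and magnitude scaling are assumed here as two commonly cited random transformations. The function names, the noise levels, the window shape of (time steps, channels), and the toy data are all hypothetical.

```python
import numpy as np


def jitter(window, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise to every sample of a (time, channels) window."""
    rng = np.random.default_rng() if rng is None else rng
    return window + rng.normal(0.0, sigma, size=window.shape)


def scale(window, sigma=0.1, rng=None):
    """Multiply each channel by a single random factor drawn around 1.0."""
    rng = np.random.default_rng() if rng is None else rng
    factors = rng.normal(1.0, sigma, size=(1, window.shape[1]))
    return window * factors


def augment(windows, labels, copies=1, seed=0):
    """Return the original windows plus `copies` synthetic versions of each.

    Labels are reused unchanged, which assumes the transformations
    preserve the class of the window.
    """
    rng = np.random.default_rng(seed)
    aug_x, aug_y = [windows], [labels]
    for _ in range(copies):
        synthetic = np.stack(
            [scale(jitter(w, rng=rng), rng=rng) for w in windows]
        )
        aug_x.append(synthetic)
        aug_y.append(labels)
    return np.concatenate(aug_x), np.concatenate(aug_y)


# Toy multivariate data (assumed): 8 windows, 128 time steps, 3 sensor channels.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(8, 128, 3))
y = data_rng.integers(0, 2, size=8)

X_aug, y_aug = augment(X, y, copies=2)
print(X_aug.shape, y_aug.shape)  # (24, 128, 3) (24,)
```

The `copies` parameter controls how much synthetic data is generated relative to the original set, which is the quantity the abstract identifies as lacking conclusive guidance.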

Authors

Md Abid Hasan

Department of Computer Science and Engineering, University of California Riverside, 900 University Ave, Riverside, 92507, CA, USA. mhasa006@ucr.edu.

Frédéric Li

Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, Lübeck 23538, Germany. Electronic address: li@imi.uni-luebeck.de.

Philip Gouverneur

Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany.

Artur Piet

Institute of Medical Informatics, University of Luebeck, Ratzeburger Allee 160, 23562, Luebeck, Germany. ar.piet@uni-luebeck.de.

Marcin Grzegorzek

Institute for Vision and Graphics, University of Siegen, Hoerlindstr. 3, 57076 Siegen, Germany.

Keywords

Humans; Machine Learning; Surveys and Questionnaires; Wearable Electronic Devices
