Exploring Self-Supervised Models for Depressive Disorder Detection: A Study on Speech Corpora.

Journal: Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
PMID:

Abstract

Automatic detection of depressive disorder from speech signals can help improve the reliability of medical diagnosis. However, most available depression datasets are relatively small, which limits the effectiveness of sophisticated deep-learning approaches. In this work, we study low-resource, speech-based depression corpora to explore the effect of speech embeddings on depression detection. The embeddings are extracted with self-supervised learning (SSL) models (Wav2Vec 2.0, HuBERT, and WavLM). We then benchmark them with several traditional classifiers, namely a support vector machine (SVM), logistic regression (LR), a decision tree (DT), and a naive Bayes classifier (NBC). In addition, we investigate the impact of layer depth in the upstream networks of these SSL models. The experimental results show that the Wav2Vec 2.0 and WavLM features generalize better than the HuBERT features. Notably, WavLM features improve depression detection accuracy by 13.1% over the best baseline features (MFCCs). These findings will aid future research on the use of SSL models in this field.
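
The layer-wise benchmarking described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the hidden states are synthetic stand-ins for SSL model outputs, mean pooling over time is an assumed way to obtain one vector per utterance, and the names (fake_hidden_states, pool_over_time), array sizes, classifier settings, and 5-fold cross-validation are all hypothetical choices that the abstract does not specify.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Assumed toy sizes; real SSL models have more layers and wider embeddings.
N_UTTERANCES, N_LAYERS, DIM = 40, 4, 16
labels = np.array([0, 1] * (N_UTTERANCES // 2))  # 0 = control, 1 = depressed


def fake_hidden_states(label):
    """Synthetic stand-in for per-layer SSL outputs: (n_layers, n_frames, dim)."""
    n_frames = int(rng.integers(20, 50))  # utterances differ in length
    states = rng.normal(size=(N_LAYERS, n_frames, DIM))
    # Purely synthetic: deeper layers get a larger class-dependent offset.
    shift = 0.3 * label * np.arange(N_LAYERS).reshape(N_LAYERS, 1, 1)
    return states + shift


def pool_over_time(states):
    """Average over frames to get one fixed-size vector per layer (assumed)."""
    return states.mean(axis=1)  # (n_layers, dim)


# Shape: (n_utterances, n_layers, dim)
pooled = np.stack([pool_over_time(fake_hidden_states(y)) for y in labels])

# The four classifier families named in the abstract, with assumed settings.
classifiers = {
    "SVM": SVC(kernel="rbf"),
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=0),
    "NBC": GaussianNB(),
}

# Evaluate every (layer, classifier) pair with stratified 5-fold CV.
results = {}
for layer in range(N_LAYERS):
    X = pooled[:, layer, :]
    for name, clf in classifiers.items():
        model = make_pipeline(StandardScaler(), clf)
        scores = cross_val_score(model, X, labels, cv=5, scoring="accuracy")
        results[(layer, name)] = float(scores.mean())

for (layer, name), acc in sorted(results.items()):
    print(f"layer {layer}  {name:<3}  mean accuracy = {acc:.3f}")
```

With real data, the synthetic generator would be replaced by the per-layer hidden states of a pretrained Wav2Vec 2.0, HuBERT, or WavLM model, and the same loop would show which layer depth yields the most useful features for each classifier.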

Authors

  • Bubai Maji
  • Shazia Nasreen
  • Rajlakshmi Guha
  • Aurobinda Routray
  • Debabrata Majumdar
  • Km Poonam