Auto-Masked Audio Spectrogram Transformer for depression detection from speech.

Journal: Journal of Affective Disorders
Published Date:

Abstract

BACKGROUND: Depression is a psychological disorder characterized by altered self-referential cognition and impaired emotional expression. Traditional diagnostic methods can be costly or intrusive, while speech-based analysis offers an accessible alternative for early detection.

METHOD: This study introduces the Auto-Masked Audio Spectrogram Transformer (AMAST), a deep learning framework that extracts depression-related features from speech spectrograms. AMAST incorporates sliding window segmentation, auto-masked training to enhance contextual learning, and a time-frequency attention mechanism to capture both time and frequency information.

RESULT: AMAST achieved F1 scores of 0.92 on the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset and 0.91 on the Multi-modal Open Dataset for Mental disorder Analysis (MODMA), outperforming baseline models. Emotionally evocative tasks such as word reading and interviews significantly improved classification performance. The model demonstrated robustness in detecting subtle depressive speech markers across various speaking conditions.

CONCLUSION: AMAST provides a promising tool for non-invasive depression screening. Its effectiveness across diverse tasks and datasets supports its potential use in clinical and remote mental health assessments. Our code is available at https://github.com/zmc314/AMAST.
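The sliding window segmentation and masking steps named in the METHOD section can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the window length, hop size, mask ratio, or how masks are selected, so the function names (`sliding_windows`, `mask_patches`), the parameter values, and the uniformly random mask selection below are all assumptions made for illustration.

```python
import random


def sliding_windows(n_frames, window, hop):
    """Return (start, end) frame indices of fixed-length windows.

    Windows that would run past the last frame are dropped
    (an assumption; padding the tail is another common choice).
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    if n_frames < window:
        return []
    return [(s, s + window) for s in range(0, n_frames - window + 1, hop)]


def mask_patches(n_patches, mask_ratio, seed=None):
    """Choose which patches to hide during masked training.

    Uniform random selection is a stand-in here; the abstract does not
    describe how AMAST picks its masks.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    n_masked = int(round(n_patches * mask_ratio))
    return sorted(rng.sample(range(n_patches), n_masked))


# Toy spectrogram: 10 time frames x 8 frequency bins (hypothetical sizes).
spectrogram = [[0.0] * 8 for _ in range(10)]

# Segment along the time axis with 50% overlap between windows.
windows = sliding_windows(len(spectrogram), window=4, hop=2)
segments = [spectrogram[start:end] for start, end in windows]

# Hide half of the segments; a model would be trained to reconstruct them.
masked = mask_patches(len(segments), mask_ratio=0.5, seed=0)
```

With these toy sizes the 10 frames yield four overlapping 4-frame segments, two of which are marked as masked. A real pipeline would apply the same indexing to a log-mel spectrogram array and pass the visible segments to the transformer encoder.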

Authors

  • Mianchen Zhang
    College of Computer Science and Technology, Beijing University of Technology, No. 100, Pingleyuan, Beijing 100124, China.
  • Jian He
    School of Software Engineering, Beijing University of Technology, Beijing, China.
  • Xiaolan Peng
    Institute of Software Chinese Academy of Sciences, No. 4 South Fourth Street, Zhong Guan Cun, Beijing 100190, China.
  • Jin Huang
    College of Life Science, Yangtze University, Jingzhou, Hubei 434023, PR China; Institute of Agricultural Products Processing, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, PR China.
  • Ning Zhang
    Institute of Nuclear Agricultural Sciences, Zhejiang University, Hangzhou, 310058, China.
  • Chunxue Wang
    Beijing Tiantan Hospital, Capital Medical University, Beijing, China.
  • Di Jiang
    College of Engineering, China Agricultural University, Beijing 100083, China.