Depression detection methods based on multimodal fusion of voice and text.

Journal: Scientific Reports
Published Date:

Abstract

Depression is a prevalent mental health disorder, and early detection is crucial for timely intervention. Traditional diagnostics often rely on subjective judgments, leading to variability and inefficiency. This study proposes a fusion model for automated depression detection that uses bimodal data from voice and text. Pre-trained Wav2Vec 2.0 and BERT models were used for feature extraction, while a multi-scale convolutional layer and a Bi-LSTM network were employed for feature fusion and classification. Adaptive pooling was used to integrate features, enabling simultaneous depression classification and PHQ-8 severity estimation within a unified system. Experiments on the CMDC and DAIC datasets demonstrate the model's effectiveness. On CMDC, the F1 score improved by 0.0103 and 0.2017 relative to the voice-only and text-only models, respectively, while RMSE decreased by 0.5186. On DAIC, the F1 score increased by 0.0645 and 0.2589, with RMSE reduced by 1.9901. These results highlight the proposed method's ability to capture and integrate multi-level information across modalities, significantly improving the accuracy and reliability of automated depression detection and severity prediction.
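
The two metrics reported above can be illustrated with a minimal sketch: F1 for the binary depression classification and RMSE for the PHQ-8 severity estimate. The helper names (`f1_score`, `rmse`) and all sample values below are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math


def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = depressed), from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positive labels or predictions at all.
    return 2 * tp / denom if denom else 0.0


def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted severity scores."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical outputs for six participants (not taken from CMDC or DAIC).
labels_true = [0, 1, 0, 1, 1, 0]                  # reference class labels
labels_pred = [0, 1, 1, 1, 0, 0]                  # classification output
phq_true = [2, 14, 7, 19, 11, 4]                  # reference PHQ-8 totals (0 to 24)
phq_pred = [3.5, 12.0, 10.5, 16.0, 9.0, 5.0]      # severity regression output

f1 = f1_score(labels_true, labels_pred)
err = rmse(phq_true, phq_pred)
print(f"F1 = {f1:.4f}, RMSE = {err:.4f}")
```

A higher F1 and a lower RMSE both indicate better performance, which is why the abstract reports F1 gains alongside RMSE reductions.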

Authors

  • Zhenrong Xu
    School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China.
  • Yuan Gao
    Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang Province, China.
  • Fang Wang
    Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, China.
  • Longqian Zhang
    School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China.
  • Li Zhang
    Department of Animal Nutrition and Feed Science, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China.
  • Junke Wang
    School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China.
  • Jie Shu
    School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China.