Multimodal Depression Recognition via Mutual Information Maximization Joint with Multi-task Learning
Journal:
IEEE Transactions on Biomedical Engineering
Published Date:
Jun 24, 2025
Abstract
Depression is a serious mental health disorder characterized by persistent sadness and hopelessness, posing a potential hazard to individuals and society. Multimodal information, including vision, audio, and text, is critical for depression diagnosis and treatment. Most studies focus on designing sophisticated feature extraction methods but neglect feature enhancement and fusion both within and across modalities. In this paper, a Chinese Multimodal Depression Corpus (CMD-Corpus) is established with the assistance of clinical experts to support further depression research. Furthermore, we propose a multimodal depression recognition framework based on Mutual Information Maximization with Multi-task Learning (MIMML) to enhance feature representation and fusion among the video, audio, and text modalities. MIMML employs a mutual-information-maximization strategy to strengthen modality-invariant representations, while multi-task learning improves single-modality representations for modality-specific enhancement. Meanwhile, a gated structure built on bidirectional gated recurrent units and convolutional neural networks is designed to fuse multimodal features, which is key to exploiting complementary information across modalities. Experimental results show that the proposed MIMML effectively captures these representations and improves depression recognition accuracy, achieving 84% on DAIC-WOZ and 89% on our self-collected CMD-Corpus.
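To make the described pipeline concrete, below is a minimal PyTorch sketch of the two mechanisms named in the abstract: an InfoNCE-style lower bound standing in for the mutual-information-maximization objective, and a gated BiGRU + CNN fusion module. The paper's exact MI estimator, layer sizes, and all identifiers here (MutualInfoLoss, GatedFusion, hidden) are assumptions for illustration, not the authors' published implementation.

```python
# Illustrative sketch only; the abstract does not specify the exact
# architecture, so all dimensions and names below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MutualInfoLoss(nn.Module):
    """InfoNCE-style lower bound on mutual information between two
    modality embeddings; minimizing this loss maximizes the bound,
    pushing the two modalities toward a shared (modality-invariant) space."""

    def __init__(self, dim: int, temperature: float = 0.1):
        super().__init__()
        self.proj = nn.Linear(dim, dim)  # simple critic projection
        self.temperature = temperature

    def forward(self, za: torch.Tensor, zb: torch.Tensor) -> torch.Tensor:
        # za, zb: (batch, dim) utterance-level features from two modalities
        za = F.normalize(self.proj(za), dim=-1)
        zb = F.normalize(zb, dim=-1)
        logits = za @ zb.t() / self.temperature        # (batch, batch)
        targets = torch.arange(za.size(0), device=za.device)
        # Matching pairs (the diagonal) are positives; all other
        # samples in the batch serve as negatives.
        return F.cross_entropy(logits, targets)


class GatedFusion(nn.Module):
    """Bidirectional GRU + 1-D CNN per modality, followed by a learned
    gate that weights each modality's contribution before fusion."""

    def __init__(self, in_dims, hidden: int = 128):
        super().__init__()
        self.grus = nn.ModuleList(
            nn.GRU(d, hidden, batch_first=True, bidirectional=True)
            for d in in_dims
        )
        self.convs = nn.ModuleList(
            nn.Conv1d(2 * hidden, hidden, kernel_size=3, padding=1)
            for _ in in_dims
        )
        self.gate = nn.Linear(len(in_dims) * hidden, len(in_dims))
        self.classifier = nn.Linear(hidden, 2)  # depressed vs. control

    def forward(self, seqs):
        # seqs: list of (batch, time, dim) tensors, one per modality
        feats = []
        for seq, gru, conv in zip(seqs, self.grus, self.convs):
            h, _ = gru(seq)                        # (batch, time, 2*hidden)
            h = conv(h.transpose(1, 2)).mean(-1)   # temporal pooling -> (batch, hidden)
            feats.append(h)
        gates = torch.sigmoid(self.gate(torch.cat(feats, dim=-1)))  # (batch, M)
        fused = sum(g.unsqueeze(-1) * f
                    for g, f in zip(gates.unbind(-1), feats))
        return self.classifier(fused), feats


if __name__ == "__main__":
    # Hypothetical video/audio/text feature sequences of differing dims.
    v = torch.randn(4, 50, 512)
    a = torch.randn(4, 300, 128)
    t = torch.randn(4, 40, 768)
    model = GatedFusion(in_dims=[512, 128, 768])
    logits, feats = model([v, a, t])
    mi = MutualInfoLoss(dim=128)
    loss = (F.cross_entropy(logits, torch.randint(0, 2, (4,)))
            + mi(feats[0], feats[1]))  # one of three pairwise MI terms
```

In training, the classification loss would plausibly be combined with pairwise MI terms over the video/audio/text embeddings plus per-modality auxiliary losses for the multi-task branch; the abstract does not give the exact objective or its weighting.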