Context-Aware Deep Learning for Multi-Modal Depression Detection
Journal:
arXiv
Published Date:
Dec 26, 2024
Abstract
In this study, we focus on automated approaches to detect depression from
clinical interviews using multi-modal machine learning (ML). Our approach
differs from other successful ML methods, such as context-aware analysis
through feature engineering and end-to-end deep neural networks, for depression
detection on the Distress Analysis Interview Corpus. We propose a novel
method that incorporates: (1) a pre-trained Transformer combined with data
augmentation based on topic modeling for textual data; and (2) a deep 1D
convolutional neural network (CNN) for acoustic feature modeling. The
simulation results demonstrate the effectiveness of the proposed method for
training multi-modal deep learning models. Our deep 1D CNN and Transformer
models achieved state-of-the-art performance for the audio and text modalities,
respectively. Combining them in a multi-modal framework also outperforms the
state of the art in the combined setting. Code is available at
https://github.com/genandlam/multi-modal-depression-detection
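
As a rough illustration of what "1D convolution over acoustic features" means, the sketch below runs a single convolution layer along the time axis of a short sequence of feature frames, applies a ReLU, and max-pools over time to get a fixed-size vector. It is not the authors' architecture: the function names, the toy input, the filter weights, and the single-layer depth are all assumptions chosen to keep the example small and dependency-free. The model described in the abstract is deep (multiple layers) and is trained rather than hand-set.

```python
# Illustrative sketch only (hypothetical names and toy values, not the paper's model):
# one 1D convolution layer over time, then ReLU and global max pooling.

def conv1d(frames, kernels, bias):
    """Valid 1D convolution along the time axis.

    frames:  list of T frames, each a list of C acoustic features
    kernels: list of F filters; each filter is K taps, each tap a list of C weights
    bias:    list of F bias terms
    returns: list of T-K+1 time steps, each a list of F activations
    """
    k = len(kernels[0])
    out = []
    for t in range(len(frames) - k + 1):
        step = []
        for filt, b in zip(kernels, bias):
            acc = b
            for dt in range(k):
                for c, w in enumerate(filt[dt]):
                    acc += w * frames[t + dt][c]
            step.append(acc)
        out.append(step)
    return out


def relu(seq):
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in step] for step in seq]


def global_max_pool(seq):
    """Collapse the time axis: keep each filter's maximum activation."""
    return [max(col) for col in zip(*seq)]


# Toy input (assumed): 5 frames, 2 features per frame.
frames = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [0.0, -1.0], [1.0, 2.0]]

# Two hand-set filters of width 2 (assumed; a real model learns these).
kernels = [
    [[1.0, 0.0], [1.0, 0.0]],   # sum of feature 0 over two adjacent frames
    [[0.0, -1.0], [0.0, 1.0]],  # change in feature 1 between adjacent frames
]
bias = [0.0, 0.0]

feature_maps = conv1d(frames, kernels, bias)
embedding = global_max_pool(relu(feature_maps))
print(embedding)
```

Pooling over time is one common way to turn variable-length interview audio into a fixed-size representation that a classifier, or a fusion step with the text model, could consume. Whether the paper pools this way is not stated in the abstract.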