Emotional Vietnamese Speech-Based Depression Diagnosis Using Dynamic Attention Mechanism
Journal:
arXiv
Published Date:
Dec 11, 2024
Abstract
Major depressive disorder is a prevalent and serious mental health condition
that negatively affects a person's emotions, thoughts, actions, and overall
perception of the world. Determining whether a person is depressed is
difficult because the symptoms of depression are often not apparent. However,
the voice can be one of the factors that reveal signs of depression. People who
are depressed express discomfort and sadness, and they may speak slowly, with a
trembling voice and little emotion. In this study, we propose the
Dynamic Convolutional Block Attention Module (Dynamic-CBAM), used within an
Attention-GRU network, to classify emotions by analyzing human audio signals.
Based on the results, we can identify which patients are depressed or prone to
depression, so that treatment and prevention can begin as soon as possible.
The research also details the computational steps
involved in implementing an Attention-GRU deep learning architecture. In
experiments, the model achieved an Unweighted Accuracy (UA) of 0.87, a Weighted
Accuracy (WA) of 0.86, and an F1 score of 0.87 on the VNEMOS dataset. Training
code is released at
https://github.com/fiyud/Emotional-Vietnamese-Speech-Based-Depression-Diagnosis-Using-Dynamic-Attention-Mechanism
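
The abstract reports UA and WA without defining them. As a minimal sketch, assuming the convention common in speech emotion recognition (WA is overall accuracy across all utterances; UA is the mean of per-class recalls, so each emotion class counts equally), the two metrics could be computed as follows. The function names and the toy labels are hypothetical and are not taken from the released training code.

```python
import numpy as np


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: fraction of all utterances classified correctly."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class has equal influence."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [
        np.mean(y_pred[y_true == c] == c)  # recall for class c
        for c in np.unique(y_true)
    ]
    return float(np.mean(recalls))


# Hypothetical toy labels (illustration only): class 0 dominates the data.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]

wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
print(f"WA = {wa:.3f}, UA = {ua:.3f}")
```

On imbalanced data the two values diverge: in the toy example the majority class is always recognized, which lifts WA, while the weaker minority-class recalls pull UA down. Under this assumed convention, the close UA and WA values reported above would suggest fairly even performance across emotion classes.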