A Multimodal Emotion Recognition System: Integrating Facial Expressions, Body Movement, Speech, and Spoken Language
Journal: arXiv
Published Date: Dec 23, 2024
Abstract
Traditional psychological evaluations rely heavily on human observation and
interpretation, which are prone to subjectivity, bias, fatigue, and
inconsistency. To address these limitations, this work presents a multimodal
emotion recognition system that serves as a standardised, objective, and
data-driven tool to support evaluators such as psychologists, psychiatrists,
and clinicians. The system integrates analysis of facial expressions, body
movement, speech, and spoken language to capture subtle emotional cues that
are often overlooked in human evaluations. By combining these
modalities, the system provides a more robust and comprehensive assessment of
emotional state, reducing the risk of misdiagnosis and overdiagnosis.
Preliminary testing under simulated real-world conditions demonstrates the
system's potential to provide reliable emotional insights that could improve
diagnostic accuracy. This work
highlights the promise of automated multimodal analysis as a valuable
complement to traditional psychological evaluation practices, with applications
in clinical and therapeutic settings.
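
The abstract does not say how the four modalities are combined. As an illustration only, one common option is late fusion, where each modality produces its own probability distribution over emotion labels and the distributions are averaged with per-modality weights. The sketch below assumes that setup; the label set, the modality names, the weights, and the function names `fuse_modalities` and `top_emotion` are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional

# Hypothetical label set; the abstract does not list the emotions recognised.
EMOTIONS = ("anger", "happiness", "neutral", "sadness")


def fuse_modalities(
    scores: Dict[str, Dict[str, float]],
    weights: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Late fusion: weighted average of per-modality emotion distributions.

    `scores` maps a modality name (e.g. "face") to a distribution over
    EMOTIONS. Labels missing from a distribution count as probability 0.
    This is an assumed fusion scheme, not the method described in the paper.
    """
    if not scores:
        raise ValueError("at least one modality is required")
    if weights is None:
        # Default to equal weighting across the modalities that are present.
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("modality weights must sum to a positive value")

    fused = {label: 0.0 for label in EMOTIONS}
    for name, distribution in scores.items():
        w = weights[name] / total  # normalise so the weights sum to 1
        for label in EMOTIONS:
            fused[label] += w * distribution.get(label, 0.0)
    return fused


def top_emotion(fused: Dict[str, float]) -> str:
    """Return the label with the highest fused probability."""
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Made-up example scores for two of the four modalities.
    example = {
        "face": {"happiness": 0.7, "neutral": 0.3},
        "speech": {"happiness": 0.4, "sadness": 0.6},
    }
    fused = fuse_modalities(example)
    print(fused, top_emotion(fused))
```

A weighted average keeps each modality's model independent, so a missing or unreliable channel (for example, no audible speech) can be dropped or down-weighted without retraining the others. Whether the system described here works this way is not stated in the abstract.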