Speech emotion recognition based on a stacked autoencoders optimized by PSO based grass fibrous root optimization.

Journal: Scientific reports

Published Date: Jul 18, 2025

Abstract

Effective speech emotion recognition (SER) poses a significant challenge due to the intricate and subjective nature of human emotions. Recognizing emotional states accurately from speech signals has a broad spectrum of practical applications, such as healthcare, human-computer interaction, and social robotics. This study introduces an approach that merges deep learning with metaheuristic algorithms to improve the performance of SER systems. Specifically, a stacked autoencoder (SAE) serves as the primary model, and it is tuned using a nature-inspired hybrid algorithm that combines particle swarm optimization (PSO) with Grass Fibrous Root Optimization (GFRO). The proposed model extracts spectral and pitch features from speech signals, including spectral crest, spectral entropy, spectral flux, and harmonic ratio, to capture emotional cues. The model is evaluated on a standard emotion recognition dataset and compared with several state-of-the-art models, including Convolutional Neural Network (CNN), Support Vector Machine (SVM), Deep Learning (DL), CNN with Iterative Neighborhood Component Analysis (CNN/INCA), and VGG-16; it achieves high accuracy in identifying various emotional states.
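As an illustration of three of the spectral features named in the abstract, the following is a minimal sketch using common textbook definitions. It is not the authors' implementation: the function names, the naive DFT helper, and the normalization of the entropy to [0, 1] are assumptions made here, and the harmonic ratio and pitch features are omitted.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame.

    Hypothetical helper for illustration; a real pipeline would use an FFT.
    """
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_crest(mag):
    """Peak magnitude divided by mean magnitude (large for tonal frames)."""
    mean = sum(mag) / len(mag)
    return max(mag) / mean if mean > 0 else 0.0


def spectral_entropy(mag):
    """Shannon entropy of the normalized spectrum, scaled to [0, 1] (assumed scaling)."""
    total = sum(mag)
    if total == 0 or len(mag) < 2:
        return 0.0
    probs = [m / total for m in mag if m > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(len(mag))


def spectral_flux(prev_mag, mag):
    """Euclidean distance between the magnitude spectra of successive frames."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(prev_mag, mag)))


# Example: a pure tone concentrates energy in one bin, so its crest is high
# and its entropy is low; a flat spectrum gives the opposite.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
tone_mag = magnitude_spectrum(tone)
flat_mag = [1.0] * 8
```

Per-frame values like these would typically be stacked into a feature vector before being passed to a model such as the stacked autoencoder; how the paper aggregates them is not stated in the abstract.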

Authors

Chi Zeng
Xinyang Vocational and Technical College, Xinyang, 464000, Henan, China.

Jialing Li
Business School, Sichuan University, Chengdu, China.

Abbas Habibi
University of Tehran, Tehran, Iran. abbashabibi208@gmail.com.

Keywords

Algorithms; Autoencoder; Deep Learning; Emotions; Humans; Neural Networks, Computer; Speech; Support Vector Machine

External Resources

PubMed ID: 40681606
