UrduSER: A comprehensive dataset for speech emotion recognition in Urdu language.

Journal: Data in brief

Published Date: Jun 1, 2025

Abstract

Speech Emotion Recognition (SER) is a rapidly evolving field of research that aims to identify and categorize emotional states through speech signal analysis. Because SER holds considerable sociocultural and business significance, researchers are increasingly exploring machine learning and deep learning techniques to advance this technology. A well-suited dataset is a crucial resource for SER studies in a specific language. However, although Urdu is the 10th most spoken language globally, it lacks adequate SER datasets, which creates a significant research gap. The available Urdu SER datasets are limited in scope: they cover a narrow range of emotions, contain few samples, and include a limited number of dialogs, which restricts their usability in real-world scenarios. To fill this gap, the Urdu Speech Emotion Recognition Dataset (UrduSER) was developed. This dataset consists of 3500 speech signals from 10 professional actors, with a balanced mix of males and females and diverse age ranges. The speech signals were sourced from a large collection of Pakistani Urdu drama serials and telefilms available on YouTube. The dataset covers seven emotional states: Angry, Fear, Boredom, Disgust, Happy, Neutral, and Sad. A notable strength of this dataset is the diversity of its dialogs: each utterance contains almost unique content, in contrast to existing datasets, which often feature repetitive samples of predefined dialogs spoken by research volunteers in a laboratory environment. To ensure balance, the dataset contains 500 samples for each emotional class, with 50 samples per actor per emotion. An accompanying Excel file provides a detailed metadata index for each audio sample, including file name, duration, format, sample rate, actor information, emotional state, and the Urdu dialogue script. This metadata index enables researchers and developers to efficiently access, organize, and use the UrduSER dataset. The dataset underwent a rigorous validation process, including expert validation, to confirm its validity, reliability, and overall suitability for research and development purposes.
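The balance described above (10 actors × 7 emotions × 50 samples = 3500 signals, hence 500 per emotion) can be checked mechanically against a metadata index. The following is a minimal sketch, not the authors' tooling: the actor IDs, the dictionary keys `actor`, `emotion`, and `file_name`, the `.wav` extension, and the helper names `build_mock_index` and `is_balanced` are all hypothetical, and the rows are generated in memory rather than read from the actual Excel file.

```python
from collections import Counter
from itertools import product

# The seven emotion labels come from the abstract.
EMOTIONS = ["Angry", "Fear", "Boredom", "Disgust", "Happy", "Neutral", "Sad"]
# Hypothetical actor identifiers; the real index may name actors differently.
ACTORS = [f"actor_{i:02d}" for i in range(1, 11)]
SAMPLES_PER_ACTOR_PER_EMOTION = 50


def build_mock_index():
    """Build a stand-in for the metadata index: one dict per audio sample."""
    return [
        {
            "actor": actor,
            "emotion": emotion,
            # Hypothetical file naming scheme and extension.
            "file_name": f"{actor}_{emotion}_{k:02d}.wav",
        }
        for actor, emotion in product(ACTORS, EMOTIONS)
        for k in range(1, SAMPLES_PER_ACTOR_PER_EMOTION + 1)
    ]


def is_balanced(rows):
    """Check the counts stated in the abstract: 3500 rows in total,
    500 per emotion, and 50 per (actor, emotion) pair."""
    per_emotion = Counter(row["emotion"] for row in rows)
    per_pair = Counter((row["actor"], row["emotion"]) for row in rows)
    return (
        len(rows) == 3500
        and set(per_emotion) == set(EMOTIONS)
        and all(count == 500 for count in per_emotion.values())
        and len(per_pair) == len(ACTORS) * len(EMOTIONS)
        and all(count == 50 for count in per_pair.values())
    )


rows = build_mock_index()
print(len(rows), is_balanced(rows))
```

With the real dataset, the same check would apply once the rows are loaded from the accompanying Excel file and its actual column names are substituted for the assumed keys.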

Authors

Muhammad Zaheer Akhtar
Department of Information Technology, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan.

Rashid Jahangir
Department of Computer Science, COMSATS University Islamabad, Vehari Campus 61100, Pakistan.

Quratul Ain
Department of Chemistry, Government College Women University Faisalabad, 03822, Pakistan. Electronic address: chemistquainhawk@gmail.com.

Muhammad Asif Nauman
Riphah School of Computing & Innovation, Riphah International University, Lahore, Pakistan.

Mueen Uddin
Department of Information Systems, Faculty of Engineering, Effat University, Jeddah, Saudi Arabia.

Syed Sajid Ullah
Department of Information and Communication Technology, University of Agder, Kristiansand, Norway.

External Resources

PubMed ID: 40496743
