Multilingual identification of nuanced dimensions of hope speech in social media texts.
Journal:
Scientific Reports
Published Date:
Jul 23, 2025
Abstract
Hope plays a crucial role in human psychology and well-being, yet its expression and detection across languages remain underexplored in natural language processing (NLP). This study presents MIND-HOPE, the first multiclass hope speech detection datasets for Spanish and German, collected from Twitter. The annotated datasets comprise 19,183 Spanish tweets and 21,043 German tweets, categorized into four classes: Generalized Hope, Realistic Hope, Unrealistic Hope, and Not Hope. The paper also provides a comprehensive review of existing hope speech datasets and detection techniques, and conducts a comparative evaluation of traditional machine learning, deep learning, and transformer-based approaches. Experimental results, obtained using 5-fold cross-validation, show that monolingual transformer models (e.g., bert-base-german-dbmdz-uncased and bert-base-spanish-wwm-uncased) consistently outperform multilingual models (e.g., mBERT, XLM-RoBERTa) in both binary and multiclass hope detection tasks. These findings underscore the value of language-specific fine-tuning for nuanced affective computing tasks. This study advances sentiment analysis by addressing a novel and underrepresented affective dimension, hope, and proposes robust multilingual benchmarks for future research. Theoretically, it contributes to a deeper understanding of hope as a complex emotional state, with practical implications for mental health monitoring, social well-being analysis, and positive content recommendation in online spaces. By modeling hope across languages and categories, this research opens new directions in affective NLP and cross-cultural computational social science.
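
The evaluation setup described above (four classes, a binary variant of the task, and 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code. The collapse of the three hope classes into a single "Hope" label is an assumption about how the binary task is derived, the stratified round-robin split is one common way to build folds rather than the procedure confirmed by the paper, and the helper names `to_binary` and `stratified_k_fold` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict

# Class names are taken from the abstract.
MULTICLASS_LABELS = ["Generalized Hope", "Realistic Hope", "Unrealistic Hope", "Not Hope"]


def to_binary(label):
    """Collapse a four-class label to Hope / Not Hope (assumed mapping)."""
    if label not in MULTICLASS_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return "Not Hope" if label == "Not Hope" else "Hope"


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs, keeping class shares similar per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's examples round-robin across the k folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical toy labels, not drawn from the MIND-HOPE data: 5 examples per class.
toy_labels = MULTICLASS_LABELS * 5
splits = list(stratified_k_fold(toy_labels, k=5))
binary_labels = [to_binary(label) for label in toy_labels]
```

In such a setup, a model would be trained on each `train_idx` and scored on the matching `test_idx`, with the five scores averaged. The same folds can be reused for the binary and multiclass tasks so that the two are compared on identical splits.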