A hybrid deep learning model for sentiment analysis of COVID-19 tweets with class balancing.
Journal:
Scientific Reports
Published Date:
Jul 30, 2025
Abstract
The widespread dissemination of misinformation and the diverse public sentiment observed during the COVID-19 pandemic highlight the necessity for accurate sentiment analysis of social media discourse. This study proposes a hybrid deep learning (DL) model that integrates Bidirectional Encoder Representations from Transformers (BERT) for contextual feature extraction with Long Short-Term Memory (LSTM) networks for sequential learning to classify COVID-19-related sentiments. To enhance data quality, advanced text preprocessing techniques, including Unicode normalization, contraction expansion, and emoji conversion, are applied. Additionally, to mitigate class imbalance, Random OverSampling (ROS) is employed, leading to significant improvements in model performance. Before applying ROS, the model exhibited lower accuracy and inconsistent performance across sentiment categories. After balancing the dataset, accuracy for binary classification increased to 92.10%, with corresponding precision, sensitivity, and specificity of 92.10%, 92.10%, and 91.50%, respectively. For three-class sentiment classification, accuracy improved to 89.47%, with precision, sensitivity, and specificity of 89.80%, 89.47%, and 94.10%, respectively. In five-class sentiment classification, accuracy reached 81.78%, with precision, sensitivity, and specificity of 82.19%, 81.78%, and 95.28%, respectively. These findings demonstrate the efficacy of combining deep learning-based sentiment analysis with advanced text preprocessing and class balancing techniques for accurately classifying public sentiment related to COVID-19 across multiple sentiment categories.
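The preprocessing and class-balancing steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `preprocess` and `random_oversample`, the tiny `CONTRACTIONS` and `EMOJI_NAMES` lookup tables, and the choice of NFKC normalization with lowercasing are all assumptions made for illustration, and only the Python standard library is used. The BERT and LSTM components are omitted.

```python
import random
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny lookup tables (assumptions for illustration).
# A real pipeline would rely on much fuller contraction and emoji mappings.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "i'm": "i am", "it's": "it is"}
EMOJI_NAMES = {
    "\U0001F637": " face_with_medical_mask ",
    "\U0001F622": " crying_face ",
}


def preprocess(text):
    """Apply Unicode normalization, emoji conversion, and contraction expansion."""
    # NFKC folds compatibility characters (e.g. full-width letters) into
    # their canonical forms; the form used in the paper is not stated.
    text = unicodedata.normalize("NFKC", text).lower()
    # Map the curly apostrophe to a straight one so contractions match the table.
    text = text.replace("\u2019", "'")
    # Replace each known emoji with a padded textual name.
    for emoji, name in EMOJI_NAMES.items():
        text = text.replace(emoji, name)
    # Expand contractions word by word; unknown words pass through unchanged.
    words = [CONTRACTIONS.get(word, word) for word in text.split()]
    return " ".join(words)


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        pool = [t for t, l in zip(texts, labels) if l == label]
        # Sample with replacement; k is 0 for the majority class.
        extra = rng.choices(pool, k=target - count)
        out_texts.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_texts, out_labels


if __name__ == "__main__":
    tweets = ["I can\u2019t breathe \U0001F637", "Vaccines work", "Stay home", "So tired of this"]
    sentiments = ["negative", "positive", "positive", "positive"]
    cleaned = [preprocess(t) for t in tweets]
    balanced_texts, balanced_labels = random_oversample(cleaned, sentiments)
    print(cleaned[0])
    print(Counter(balanced_labels))
```

Because random oversampling duplicates existing samples, it is usually applied to the training split only, so that copies of the same tweet do not appear in both training and evaluation data. The abstract does not say how the split was handled in the study.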