BanglaNewsClassifier: A machine learning approach for news classification in Bangla Newspapers using hybrid stacking classifiers.
Journal:
PLOS ONE
Published Date:
Jun 9, 2025
Abstract
As the volume of Bangla news published online grows, so does the need for more accurate and efficient classification techniques. Previous studies have mostly relied on traditional models and have overlooked the potential of hybrid techniques to handle increasingly large and complex datasets and the linguistic patterns of Bangla. To address this gap, this study presents a comprehensive approach to classifying Bangla news articles into eight distinct categories using machine learning and deep learning techniques. Our experiments covered traditional machine learning algorithms, deep learning architectures, and hybrid models, including novel stacking classifiers. We used a dataset of 118,404 Bangla news articles and extracted features with TF-IDF vectorization and Word2Vec embeddings. Our best-performing model, a stacking meta-classifier combining bidirectional long short-term memory (BiLSTM) and a support vector machine (SVM), achieved 94% accuracy, outperforming all of the basic models. We also provide an in-depth analysis of model performance, including confusion matrices, ROC curves, and error analysis, offering insight into the strengths and limitations of each approach. This research contributes to Bangla natural language processing and demonstrates the efficacy of ensemble methods and deep learning for news classification in low-resource languages.
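To make the TF-IDF feature extraction step concrete, the following is a minimal sketch in plain Python. It is not the study's implementation: the abstract does not state which TF-IDF variant, tokenizer, or library was used, so the weighting formula (term frequency normalized by document length, multiplied by log(N / document frequency)), the whitespace tokenization, the function name `tfidf`, and the three-document toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, whitespace-tokenized; not taken from the study's dataset.
corpus = [
    "খেলা ক্রিকেট দল".split(),
    "অর্থনীতি বাজার দল".split(),
    "ক্রিকেট খেলা খেলা".split(),
]
vectors = tfidf(corpus)
```

Under this assumed formula, a term that occurs in only one document receives a higher weight than a term shared by several documents, which is what makes TF-IDF useful for separating news categories. In a stacking setup such as the one the abstract describes, vectors like these would typically feed one of the base classifiers, while the meta-classifier learns from the base classifiers' predictions.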