Predicting stunting status among under-five children in Ethiopia using ensemble machine learning algorithms.
Journal:
Scientific Reports
Published Date:
Jul 31, 2025
Abstract
Childhood stunting is a persistent public health challenge in Ethiopia that impairs children's physical growth, cognitive development, and overall well-being. This study addresses a key limitation of previous stunting prediction models by developing a multi-class classification model that predicts stunting severity (severe, moderate, normal) from nationally representative data. Secondary data from the 2011 and 2016 Ethiopian Demographic and Health Surveys (EDHS) were analyzed, comprising 18,451 instances with 28 features. Preprocessing included handling missing values, removing duplicates, feature selection, and class balancing with the synthetic minority over-sampling technique (SMOTE), yielding 33,495 instances with 18 selected features. Four ensemble machine learning algorithms (Random Forest, AdaBoost, XGBoost, and CatBoost) were implemented and evaluated on accuracy, precision, recall, F1-score, and ROC-AUC. Random Forest achieved the highest performance, with an accuracy of 97.985%, precision of 97.986%, recall of 97.985%, F1-score of 97.954%, and ROC-AUC of 99.995%. The top risk factors contributing to stunting were the child's age, maternal education level, birth order, household wealth index, mother's BMI, breastfeeding duration, and access to clean water and sanitation. These results demonstrate that ensemble machine learning can predict childhood stunting severity in Ethiopia with high accuracy. The findings can help healthcare professionals and policymakers design targeted interventions to reduce the prevalence of childhood stunting.
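The class-balancing step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the segment between an existing minority sample and one of its nearest same-class neighbours. The abstract does not say which implementation the authors used, so the function name `smote_oversample`, the neighbour count `k=5`, and the toy two-feature data below are all assumptions made for illustration, not the study's code or the EDHS data.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Hypothetical helper: balance classes by interpolating between
    same-class nearest neighbours (the core idea behind SMOTE)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # bring every class up to the largest class
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if n == target or len(members) < 2:
            continue  # already at target, or no neighbour to interpolate with
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point, excluding itself
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> mate
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


# Toy data only (assumption): two numeric features, three imbalanced classes.
rng = random.Random(42)
centres = {"normal": (0.0, 0.0), "moderate": (3.0, 3.0), "severe": (6.0, 0.0)}
sizes = {"normal": 60, "moderate": 25, "severe": 10}
X, y = [], []
for label, (cx, cy) in centres.items():
    for _ in range(sizes[label]):
        X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
        y.append(label)

X_bal, y_bal = smote_oversample(X, y)
print("before:", Counter(y))
print("after: ", Counter(y_bal))
```

The reported post-balancing total of 33,495 instances is divisible by three (3 × 11,165), which would be consistent with three equally sized severity classes, although the abstract does not state the per-class counts.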