Methodological Integration of Machine Learning and Geospatial Analysis for PM Pollution Mapping.

Journal: MethodsX
Published Date:

Abstract

Air pollution mitigation necessitates accurate spatial modelling to inform public health interventions. Traditional approaches inadequately capture complex predictor-pollutant interactions, whereas machine learning (ML) offers a superior capacity for modelling nonlinear relationships. This study compares three ML Random Forest (RF), K-Nearest Neighbors (KNN), and Naïve Bayes (NB) algorithms using annual PM data from 11 monitoring stations alongside atmospheric, urban, and terrain covariates. The methodological framework employed rigorous preprocessing and cross-validation to classify pollution into three categorical levels. Results demonstrate RF superior performance, achieving 94% balanced accuracy and 97% specificity, significantly outperforming KNN (92%) and NB (89%). RF excelled in capturing spatial heterogeneity and complex variable interactions, while KNN and NB exhibited limitations in managing feature dependencies and localized variability. Despite computational demands, findings substantiate RF reliability for robust air quality monitoring applications. The study contributes valuable insights for implementing scalable pollution prediction systems in resource-constrained urban environments while acknowledging interpretability challenges inherent to complex ML models.•Preprocessing of spatial data from various sources, incorporating the handling of missing/abnormal data, analysis, and normalization•Implementation of the three ML algorithms with rigorous hyperparameter tuning, model validation, and performance assessment•Mapping PM Hotspots on the Gradient Direction and Distance from the City Center.

Authors

  • Kalid Hassen Yasin
    Geo-Information Science Program, School of Geography and Environmental Studies, Haramaya University, P.O. Box 138, 3220 Dire Dawa, Ethiopia.
  • Muaz Ismael Yasin
    School of Medicine, College of Health and Medical Sciences, Haramaya University, P.O. Box 235, Harar, Ethiopia.
  • Anteneh Derribew Iguala
    Center for Rural Development, Oromia State University, Batu P.O. Box 209, Ethiopia.
  • Tadele Bedo Gelete
    Geo-Information Science Program, School of Geography and Environmental Studies, Haramaya University, P.O. Box 138, 3220 Dire Dawa, Ethiopia.
  • Erana Kebede
    School of Plant Sciences, College of Agriculture and Environmental Sciences, Haramaya University, P.O. Box 138, Dire Dawa, Ethiopia.

Keywords

No keywords available for this article.