A holistic air monitoring dataset with complaints and POIs for anomaly detection and interpretability tracing.
Journal:
Scientific Data
Published Date:
Jul 24, 2025
Abstract
Urban air pollution poses a global health risk. This study presents the Airware-Haikou dataset, a resource for urban air pollution research that integrates multivariate time-series air quality monitoring data (MTSAM), Point of Interest (POI) data, and a public complaint corpus. The MTSAM, collected from 95 monitoring stations in Haikou, China, includes hourly measurements of six air pollutants and five meteorological factors. The data underwent pre-processing, including spatial-temporal interpolation and rebalancing, to ensure consistency and reliability. Using POI data and monitoring station coordinates, the MTSAM was segmented into four spatial-temporal subsets via cluster analysis, enabling detailed characterization of air quality dynamics. The public complaint corpus, extracted with the UIE model, serves as a baseline for post hoc interpretation of deep learning models, linking public sentiment with empirical air quality data. The Airware-Haikou dataset offers a comprehensive foundation for urban air pollution studies, and its validation model, DsRL-Net, is reported to improve the accuracy and reliability of pollution detection.
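The segmentation step described in the abstract (grouping stations into four subsets from their coordinates and nearby POI data) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the POI features, so everything here is an assumption: plain k-means is swapped in as a stand-in, the feature columns (longitude, latitude, industrial POI count, residential POI count) are hypothetical, and the station values are synthetic rather than taken from Airware-Haikou.

```python
import math

# Synthetic, hypothetical station features (NOT from the dataset):
# (longitude, latitude, industrial_poi_count, residential_poi_count)
stations = [
    (110.20, 20.02, 40, 5),
    (110.21, 20.03, 42, 6),
    (110.45, 20.01, 3, 60),
    (110.46, 20.02, 4, 58),
    (110.22, 19.80, 2, 4),
    (110.23, 19.81, 3, 5),
    (110.47, 19.79, 35, 55),
    (110.48, 19.80, 37, 57),
]

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

# Four subsets, mirroring the number of subsets stated in the abstract.
labels = kmeans(standardize(stations), k=4)
for station, label in zip(stations, labels):
    print(label, station)
```

Standardizing before clustering keeps the POI counts, which span tens of units, from dominating the coordinate columns, which differ only by fractions of a degree. A real analysis would likely use a library implementation and the feature set documented with the dataset.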