Advancing low-cost air quality monitor calibration with machine learning methods.

Journal: Environmental pollution (Barking, Essex : 1987)
PMID:

Abstract

Low-cost monitors for measuring airborne contaminants have gained popularity due to their affordability, portability, and ease of use. However, they often exhibit significant biases compared to high-cost reference instruments. For optimal accuracy, these monitors require calibration and validation in their specific environment against reference instruments, which are often scarce and costly. This study proposes machine-learning calibration methods that use a single high-cost instrument as an active reference to improve the accuracy of large networks of low-cost monitors. Three machine learning models were employed: linear regression, random forest, and gradient boosting regression (GBR). The proposed approach was tested in a controlled chamber under two conditions: environmental simulations with salt- and dust-based aerosols, and occupational settings using three electronic cigarette (ECIG) brands. The study involved thirty low-cost GeoAir2 monitors, divided into ten groups of three. Initially, all groups were collocated with a high-cost monitor using Aerosol A to develop prediction and regression models. These models, along with intrinsic error measurements from one group, were then applied to improve data accuracy for the remaining groups using Aerosol B. The results demonstrated substantial improvements in accuracy, with r values ranging from 0.91 to 1.00 and RMSE reductions of up to 88%, depending on the model and aerosol type. GBR consistently provided the highest accuracy and performance, particularly for complex, nonlinear patterns, while linear regression offered a faster, computationally efficient alternative suitable for less demanding scenarios. Random forest models performed moderately well, balancing accuracy and complexity. These methods provide a scalable and cost-effective solution for deploying networked low-cost sensors. Further research is needed to validate these findings in outdoor environments with meteorological and spatial influences, and in indoor occupational settings where humidity effects may play a role.
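
As a rough illustration of the simplest of the three models, the sketch below fits a linear regression calibration on one collocation run and scores it on a second run using RMSE and Pearson r. It is not the study's implementation: the data are synthetic, the bias and noise parameters are invented for illustration, and the helper names (`fit_line`, `rmse`, `pearson_r`, `simulate`) are hypothetical. It also omits the step in which intrinsic error measurements from one group are applied to the remaining groups.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n, gain, offset, noise_sd, rng):
    """Synthetic collocation run: reference values and biased low-cost readings.

    All parameters are made up for this sketch, not taken from the study.
    """
    reference = [rng.uniform(0.0, 100.0) for _ in range(n)]
    raw = [gain * c + offset + rng.gauss(0.0, noise_sd) for c in reference]
    return raw, reference


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# "Aerosol A" run: used only to fit the calibration.
raw_a, ref_a = simulate(200, gain=1.6, offset=5.0, noise_sd=2.0, rng=rng)
# "Aerosol B" run: assumed to have a slightly different sensor response.
raw_b, ref_b = simulate(200, gain=1.5, offset=5.0, noise_sd=2.0, rng=rng)

# Regress reference on raw readings, then apply the fit to the second run.
slope, intercept = fit_line(raw_a, ref_a)
calibrated_b = [slope * r + intercept for r in raw_b]

rmse_raw = rmse(raw_b, ref_b)
rmse_cal = rmse(calibrated_b, ref_b)
reduction_pct = 100.0 * (rmse_raw - rmse_cal) / rmse_raw
r_cal = pearson_r(calibrated_b, ref_b)

print(f"RMSE raw: {rmse_raw:.2f}, calibrated: {rmse_cal:.2f}")
print(f"RMSE reduction: {reduction_pct:.1f}%, r after calibration: {r_cal:.3f}")
```

Swapping `fit_line` for a tree-based regressor such as random forest or GBR would follow the same fit-on-one-run, score-on-another pattern, at the cost of a third-party dependency and more computation.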

Authors

  • Sinan Sousan
    Department of Public Health, Brody School of Medicine, East Carolina University, Greenville, NC, 27858, USA; North Carolina Agromedicine Institute, Greenville, NC, 27858, USA; Center for Human Health and the Environment, NC State University, Raleigh, NC, USA. Electronic address: sousans18@ecu.edu.
  • Rui Wu
    School of Materials and Energy, University of Electronic Science and Technology of China, Chengdu, 611731, China.
  • Ciprian Popoviciu
    Department of Technology Systems, East Carolina University, Greenville, NC, USA.
  • Sarah Fresquez
    Department of Public Health, Brody School of Medicine, East Carolina University, Greenville, NC, 27858, USA.
  • Yoo Min Park
    Center for Nano Bio Development, National Nanofab Center (NNFC), Daejeon, 34141, Republic of Korea.