Applying machine learning to forecast daily Ambrosia pollen using environmental and NEXRAD parameters.
Journal: Environmental Monitoring and Assessment
PMID: 31254085
Abstract
Approximately 50 million Americans have allergic diseases. Airborne plant pollen is a significant trigger for several of these diseases. In North America, Ambrosia (ragweed) is known for its abundant pollen production and its potent allergenic effect. Hence, estimating and predicting the daily atmospheric concentration of pollen (ragweed pollen in particular) is useful both for people with allergies and for the health professionals who care for them. In this study, we show that a suite of variables, including meteorological and land surface parameters as well as next-generation radar (NEXRAD) measurements, can be combined with machine learning to successfully estimate the daily pollen concentration. The supervised machine learning approaches we used were random forests, neural networks, and support vector machines. Model performance was independently validated on 10% of the original dataset, set aside using the holdout method. The random forests (R = 0.61, R² = 0.37), support vector machines (R = 0.51, R² = 0.26), and neural networks (R = 0.46, R² = 0.21) effectively predicted the daily Ambrosia pollen, where the correlation coefficient (R) and R-squared (R²) values are given in parentheses. Three independent approaches (random forests, correlation coefficients, and interaction information) were employed to rank the relative importance of the available predictors.
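The validation scheme and the reported scores can be illustrated with a minimal sketch. It is not the authors' code: the helper names `holdout_split` and `pearson_r`, the random seed, and the shuffling strategy are assumptions made for illustration only. The sketch sets aside 10% of the sample indices as a holdout set, defines the correlation coefficient R between observed and predicted values, and checks that the reported R-squared values are consistent with simply squaring the reported R values.

```python
import math
import random


def holdout_split(n_samples, test_fraction=0.10, seed=0):
    """Return (train_idx, test_idx) for one random holdout partition.

    Hypothetical helper: the abstract does not say how the 10% was drawn.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# R values as reported in the abstract for the holdout set.
reported_r = {
    "random forests": 0.61,
    "support vector machines": 0.51,
    "neural networks": 0.46,
}

# Squaring each R and rounding to two decimals gives the reported R-squared.
r_squared = {name: round(r ** 2, 2) for name, r in reported_r.items()}
```

In practice, each model would be fitted on the training indices only, and `pearson_r` would be evaluated on the observed and predicted pollen concentrations for the holdout indices.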