Enhancing meteorological data reliability: An explainable deep learning method for anomaly detection.

Journal: Journal of environmental management
PMID:

Abstract

Accurate meteorological observation data is of great importance to human production activities. Meteorological observation systems have been advancing toward automation, intelligence, and informatization. Yet instrument malfunctions and unstable sensor node resources can cause the data to deviate significantly from the actual conditions it should reflect. To achieve greater data accuracy, early detection of data anomalies, continuous data collection, and timely data transmission are essential. While obvious anomalies can be readily identified, detecting systematic and gradually emerging anomalies requires further analysis. This study develops an interpretable deep learning method based on an autoencoder (AE), SHapley Additive exPlanations (SHAP), and Bayesian optimization (BO) to facilitate prompt and accurate anomaly detection in meteorological observation data. The proposed method has four parts. First, the AE detects anomalies in multidimensional meteorological datasets by flagging data that shows significant reconstruction errors. Second, the model evaluates the importance of each meteorological element of the flagged data via SHAP. Third, a K-sigma-based automatic threshold delineation method is employed to obtain reasonable anomaly thresholds that reflect the data characteristics of different observation sites. Finally, the BO algorithm is adopted to tune hard-to-set hyperparameters, improving the model's structure and thus the accuracy of anomaly detection. In practice, the proposed model can inform agricultural production, climate observation, and disaster prevention.
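
The first and third steps above (flagging by reconstruction error, then setting a K-sigma threshold) can be illustrated with a minimal sketch. It is not the authors' implementation: a PCA-based linear projection stands in for the neural autoencoder, the threshold is assumed to take the common form mean + k * standard deviation of the training reconstruction errors, the data is synthetic, and all function names are hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(X, n_latent):
    """Fit a PCA-based linear 'autoencoder' (a stand-in for a neural AE)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0) + 1e-12  # avoid division by zero for constant columns
    Z = (X - mean) / std
    # Rows of vt are the principal directions of the standardized data.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, vt[:n_latent]


def reconstruction_error(X, mean, std, components):
    """Mean squared error per sample after encode -> decode."""
    Z = (X - mean) / std
    Z_hat = Z @ components.T @ components  # project onto latent space and back
    return ((Z - Z_hat) ** 2).mean(axis=1)


def k_sigma_threshold(errors, k=3.0):
    """Assumed K-sigma rule: mean + k * std of the reference errors."""
    return errors.mean() + k * errors.std()


# Synthetic "site": three variables driven by one shared latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X_train = latent @ np.array([[1.0, -0.8, 0.5]]) + 0.05 * rng.normal(size=(500, 3))

model = fit_linear_autoencoder(X_train, n_latent=1)
train_err = reconstruction_error(X_train, *model)
threshold = k_sigma_threshold(train_err, k=3.0)

# One reading consistent with the learned relationship, and one where the
# second variable has its sign flipped (a hypothetical sensor fault).
X_new = np.array([[1.0, -0.8, 0.5],
                  [1.0, 0.8, 0.5]])
flags = reconstruction_error(X_new, *model) > threshold
print(flags)
```

Because the threshold is computed from each site's own error distribution, the same value of k yields different absolute thresholds at different sites, which is the property the abstract attributes to the K-sigma step. The value k = 3 here is only an illustrative choice.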

Authors

  • Zhongke Qu
    School of Human Settlements and Civil Engineering, Xi'an Jiaotong University, Xi'an, 710049, China.
  • Ruizhi Xiao
    Institute of Earth Environment, Chinese Academy of Sciences, Xi'an, 710000, China.
  • Ke Yang
    National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China.
  • Mingjuan Li
    Shaanxi Climate Center, Xi'an, 710000, China.
  • Xinyu Hu
    Huaxi MR Research Center (HMRRC) Department of Radiology, West China Hospital Sichuan University, Chengdu, 610041, China.
  • Zhichao Liu
    Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA.
  • Xilian Luo
    School of Human Settlements and Civil Engineering, Xi'an Jiaotong University, Xi'an, 710049, China.
  • Zhaolin Gu
    School of Human Settlements and Civil Engineering, Xi'an Jiaotong University, Xi'an, 710049, China; Key Laboratory of Eco-Environment and Meteorology for the Qinling Mountains and Loess Plateau, China Meteorological Administration, Xi'an, 710000, China. Electronic address: guzhaoln@mail.xjtu.edu.cn.
  • Chengwei Li
    Department of Radiology, The Third People's Hospital of Chengdu, Chengdu, China.