Remotely sensed estimates of long-term biochemical oxygen demand over Hong Kong marine waters using machine learning enhanced by imbalanced label optimisation.

Journal: The Science of the total environment
PMID:

Abstract

In many coastal cities around the world, continuing water degradation threatens the living environment of humans and aquatic organisms. To support the assessment and control of water pollution, this study estimated the Biochemical Oxygen Demand (BOD) concentration of Hong Kong's marine waters using remote sensing and an improved machine learning (ML) method. The scheme was derived from four ML algorithms (RBF, SVR, RF, XGB) and calibrated using a large set (N > 1000) of in-situ BOD measurements. Three types of models were trained and evaluated on labeled datasets that differed in preprocessing: the original BOD, log(BOD), and BOD with label distribution smoothing (LDS). The results highlight the superior potential of the LDS-based model to improve BOD estimates by addressing the imbalance in the training dataset. Additionally, XGB and RF outperformed RBF and SVR when the model was developed using log(BOD) or LDS(BOD). Over two decades, the BOD concentration of Hong Kong marine waters in the autumn (Sep. to Nov.) showed a downward trend, with significant decreases in Deep Bay, Western Buffer, Victoria Harbour, Eastern Buffer, Junk Bay, Port Shelter, and the Tolo Harbour and Channel. Principal component analysis revealed that nutrient levels were the predominant factor in Victoria Harbour and the interior of Deep Bay, while chlorophyll-related and physical parameters were dominant in Southern, Mirs Bay, Northwestern, and the outlet of Deep Bay. LDS provides a new perspective for improving ML-based water quality estimation by alleviating the imbalance in the labeled dataset. Overall, the remotely sensed BOD can offer insight into the spatiotemporal distribution of organic matter in Hong Kong coastal waters and valuable guidance for pollution control.
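
The abstract does not describe how LDS was implemented. As a rough illustration of the general idea only, the sketch below bins a continuous label, smooths the bin counts with a Gaussian kernel, and weights each sample by the inverse of the smoothed density, so that rare label values count more during training. The function name lds_weights, the bin count, the kernel size, sigma, and the synthetic labels are all assumptions made for this example; none of them come from the study.

```python
import numpy as np


def lds_weights(labels, n_bins=20, kernel_size=5, sigma=2.0):
    """Hypothetical helper: per-sample weights from a smoothed label histogram.

    n_bins, kernel_size and sigma are illustrative defaults, not values
    taken from the study.
    """
    labels = np.asarray(labels, dtype=float)

    # Assign every label to one of n_bins equal-width bins (indices 0..n_bins-1).
    edges = np.linspace(labels.min(), labels.max(), n_bins + 1)
    bin_idx = np.clip(np.digitize(labels, edges[1:-1]), 0, n_bins - 1)
    counts = np.bincount(bin_idx, minlength=n_bins).astype(float)

    # Symmetric Gaussian kernel, normalised to sum to 1.
    half = kernel_size // 2
    offsets = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()

    # Smooth the empirical label density across neighbouring bins.
    smoothed = np.convolve(counts, kernel, mode="same")

    # Weight each sample by the inverse smoothed density of its bin,
    # then rescale so the weights average to 1.
    weights = 1.0 / np.maximum(smoothed[bin_idx], 1e-8)
    return weights * len(weights) / weights.sum()


# Synthetic, right-skewed labels standing in for an imbalanced target.
# These are NOT real BOD measurements.
rng = np.random.default_rng(0)
labels = rng.lognormal(mean=0.0, sigma=1.0, size=1000)
weights = lds_weights(labels)
```

Weights of this kind could then be supplied to a learner that accepts per-sample weights, for example through the sample_weight argument of fit in scikit-learn's RandomForestRegressor. Whether the study applied LDS in this way is not stated in the abstract.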

Authors

  • Yadong Zhou
    Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China.
  • Boayin He
    Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China. Electronic address: heby@apm.ac.cn.
  • Xiaoyu Cao
    School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing, China.
  • Yu Xiao
    Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, the Second School of Clinical Medicine, Southern Medical University, Guangzhou, China.
  • Qi Feng
    Panzhihua University, Panzhihua 617000, Sichuan, China.
  • Fan Yang
    School of Electrical Engineering and Automation, Jiangsu Normal University, Xuzhou, China.
  • Fei Xiao
    Peking University Fifth School of Clinical Medicine, Beijing, China.
  • Xueer Geng
    Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China; University of Chinese Academy of Sciences, Beijing 100049, China.
  • Yun Du
    School of Nursing and Rehabilitation, Shandong University, Jinan, Shandong, China.