Identifying human activities causing water pollution based on microbial community sequencing and source classifier machine learning.

Journal: Environment international
PMID:

Abstract

Identifying and differentiating human activities is crucial for effectively preventing the threats that environmental pollution poses to aquatic ecosystems and human health. Machine learning (ML) is a powerful analytical tool for tracking human impacts on river ecosystems based on high-throughput datasets. This study employed an ML framework and 16S rRNA sequencing data to reveal microbial dynamics and trace human activities across China. The results revealed that microbial assembly was dominated by deterministic factors (environmental factors and interactions between species), and that the metacommunity partition was significantly associated with human activities in both water and sediment (Chi-square test, P = 1.93 × 10; Chi-square test, P = 6.00 × 10). Human activities increased the vulnerability of interspecific occurrence networks and the influence of environmental factors on OTU similarity and phylogenetic distance. Combining microbiological indices (MBIs), microbial relative abundance (MRA), and environmental and geographical indices (EGIs), the source classifier machine learning (SCML) algorithm was used to categorize five human activities (i.e., low human-impact, agricultural inputs, domestic inputs, industrial inputs, and dam construction). The optimal SCML configuration (MBIs + MRA + EGIs) exhibited strong performance, with Test R of 0.882 and Test R of 0.924. This study provides valuable insights for improving ecosystem management, supporting sustainable water resource management, and advancing pollution mitigation efforts.
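
The association between metacommunity partition and human activity reported above rests on a Chi-square test. As a minimal sketch of how such a test statistic could be computed, the Python snippet below builds a contingency table from paired labels and returns the Pearson chi-square statistic and its degrees of freedom. The function name `chi_square_independence`, the cluster labels, and the sample counts are all hypothetical and chosen for illustration; they are not the study's data or code, and the snippet does not compute a P value.

```python
from collections import Counter


def chi_square_independence(pairs):
    """Return (statistic, dof) for a Pearson chi-square test of independence.

    `pairs` is an iterable of (row_label, column_label) tuples, one per sample.
    """
    pairs = list(pairs)
    n = len(pairs)
    observed = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)

    statistic = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            # Expected count under independence of the two labelings.
            expected = row_n * col_n / n
            statistic += (observed[(row, col)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Made-up example: 20 samples, two metacommunity clusters, two activity types.
samples = (
    [("cluster_A", "low human-impact")] * 8
    + [("cluster_A", "industrial inputs")] * 2
    + [("cluster_B", "low human-impact")] * 2
    + [("cluster_B", "industrial inputs")] * 8
)

stat, dof = chi_square_independence(samples)
print(f"chi-square = {stat:.2f}, dof = {dof}")  # chi-square = 7.20, dof = 1
```

In practice a statistics library would typically be used to convert the statistic and degrees of freedom into a P value; the hand-rolled version here only makes the underlying arithmetic visible.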

Authors

  • Zhangmu Jing
    State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Science, Beijing 100012, China; State Environmental Protection Key Laboratory of Estuarine and Coastal Environment, Chinese Research Academy of Environmental Science, Beijing 100012, China; State Key Laboratory of Pollution Control and Resource Reuse, College of Environmental Science and Engineering, Tongji University, Shanghai, 200092, China; School of Civil and Environmental Engineering, Nanyang Technological University, 639798, Singapore.
  • Yi Zhang
    Department of Thyroid Surgery, China-Japan Union Hospital of Jilin University, Jilin University, Changchun, China.
  • Xiaoling Liu
    Department of Endocrinology, Affiliated Hospital of Guilin Medical University, Guilin, Guangxi, China.
  • Qingqian Li
    State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Science, Beijing 100012, China; State Environmental Protection Key Laboratory of Estuarine and Coastal Environment, Chinese Research Academy of Environmental Science, Beijing 100012, China.
  • Yanji Hao
    State Key Laboratory of Heavy Oil Processing, Beijing Key Laboratory of Biogas Upgrading Utilization, College of New Energy and Materials, China University of Petroleum Beijing (CUPB), Beijing, 102249, China.
  • Yeqing Li
    State Key Laboratory of Heavy Oil Processing, Beijing Key Laboratory of Biogas Upgrading Utilization, College of New Energy and Materials, China University of Petroleum Beijing (CUPB), Beijing 102249, PR China. Electronic address: liyeqingcup@126.com.
  • Hongjie Gao
    State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Science, Beijing 100012, China; State Environmental Protection Key Laboratory of Estuarine and Coastal Environment, Chinese Research Academy of Environmental Science, Beijing 100012, China; State Key Laboratory of Pollution Control and Resource Reuse, College of Environmental Science and Engineering, Tongji University, Shanghai, 200092, China. Electronic address: gaohj@craes.org.cn.