Advancing micro-nano supramolecular assembly mechanisms of natural organic matter by machine learning for unveiling environmental geochemical processes.

Journal: Environmental science. Processes & impacts
PMID:

Abstract

The nano-self-assembly of natural organic matter (NOM) profoundly influences the occurrence and fate of NOM and pollutants in large-scale complex environments. Machine learning (ML) offers a promising and robust tool for interpreting and predicting the processes, structures and environmental effects of NOM self-assembly. This review provides a tutorial-like compilation of data source determination, algorithm selection, model construction, interpretability analyses, applications and challenges for big-data-based ML aimed at elucidating NOM self-assembly mechanisms in the environment. Results from advanced nano- to submicron-scale spatial chemical analytical technologies are suggested as input data because they combine information on molecular interactions with structural visualization. Existing ML algorithms need to handle multi-scale and multi-modal data, which necessitates the development of new algorithmic frameworks. Interpretable supervised models are crucial owing to their strong capacity to quantify structure-property-effect relationships and to bridge the gap between purely data-driven ML and the complexity of NOM assembly in practice. The necessity and challenges of adopting ML to understand the geochemical behaviors and bioavailability of pollutants, as well as the elemental cycling processes in environments that result from NOM self-assembly patterns, are then discussed. Finally, a research framework integrating ML, experiments and theoretical simulation is proposed for comprehensively and efficiently understanding environmental issues involving NOM self-assembly.

Authors

  • Ming Zhang
    Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, College of Veterinary Medicine, Harbin 150030, China.
  • Yihui Deng
    College of Environment, Zhejiang University of Technology, Hangzhou, 310014, P. R. China. panxl@zjut.edu.cn.
  • Qianwei Zhou
    College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, PR China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, PR China.
  • Jing Gao
    Department of Gastroenterology 3, Hubei University of Medicine, Renmin Hospital, Shiyan, Hubei, China.
  • Daoyong Zhang
    College of Geoinformatics, Zhejiang University of Technology, Hangzhou, 310014, P. R. China. zhangdaoyong@zjut.edu.cn.
  • Xiangliang Pan
    College of Environment, Zhejiang University of Technology, Hangzhou, 310014, P. R. China. panxl@zjut.edu.cn.