SapFlower: an automated tool for sap flow data preprocessing, gap-filling, and analysis using deep learning.

Journal: The New Phytologist
PMID:

Abstract

Sap flow, a critical process in plant water use and ecosystem water cycles, is often measured using thermal dissipation probes (TDP) due to their ease of installation and continuous data collection. However, sap flow data frequently include noise, outliers, and gaps, creating challenges for analysis and requiring substantial manual processing. We developed SapFlower, a tool that automates data preprocessing, model training, gap-filling, sapwood area scaling and modeling, and water use analysis. It integrates autocleaning, machine learning and deep learning models (e.g. random forest, Gaussian process regression, long short-term memory (LSTM), bidirectional LSTM (BiLSTM)), and efficient workflows to process sap flow data. SapFlower can remove over 90% of noisy data while preserving legitimate variations and achieve high accuracy in gap-filling based on user-determined parameters. Random forest, LSTM, and BiLSTM models reduced root mean square error to 10% or less for long-term gaps. Model training and prediction can be performed efficiently within seconds. SapFlower significantly enhances the efficiency and accessibility of TDP data analysis by automating complex tasks, enabling researchers without programming expertise to employ advanced techniques. Future improvements will focus on species-specific corrections for TDP and support for additional measurement methods. SapFlower is openly available on GitHub (https://github.com/JiaxinWang123/SapFlower) and Zenodo (doi: 10.5281/zenodo.13665919).
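
The cleaning, gap-filling, and error-scoring steps summarized above can be illustrated with a minimal, standard-library sketch. It is not SapFlower's implementation. A median/MAD despiking rule stands in for the tool's autocleaning, and a mean-diurnal-course fill stands in for the random forest, LSTM, and BiLSTM models. The function names, the thresholds, the synthetic series, and the choice to express RMSE as a percentage of the observed range are all assumptions made for illustration, since the abstract does not define how the percentage is normalized.

```python
import math
import statistics


def despike(values, window=5, threshold=3.0, min_scale=0.5):
    """Set points far from the local median to None (hypothetical rule)."""
    half = window // 2
    cleaned = list(values)
    for i, v in enumerate(values):
        if v is None:
            continue
        nearby = [x for x in values[max(0, i - half): i + half + 1] if x is not None]
        med = statistics.median(nearby)
        mad = statistics.median(abs(x - med) for x in nearby)
        # Floor the scale so flat night-time stretches (MAD = 0) do not divide by zero.
        scale = max(1.4826 * mad, min_scale)
        if abs(v - med) / scale > threshold:
            cleaned[i] = None
    return cleaned


def fill_mean_diurnal(values, steps_per_day):
    """Fill each None with the mean of available values at the same time of day."""
    by_slot = {}
    for i, v in enumerate(values):
        if v is not None:
            by_slot.setdefault(i % steps_per_day, []).append(v)
    filled = []
    for i, v in enumerate(values):
        if v is None:
            slot = by_slot.get(i % steps_per_day)
            filled.append(statistics.fmean(slot) if slot else None)
        else:
            filled.append(v)
    return filled


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(statistics.fmean((p - o) ** 2 for p, o in zip(predicted, observed)))


# Synthetic hourly "sap flow": zero at night, a sine bump by day, and a peak
# that grows by one unit per day (all values are made up for this sketch).
STEPS_PER_DAY = 24
truth = [
    (10 + i // STEPS_PER_DAY)
    * max(0.0, math.sin(2 * math.pi * (i % STEPS_PER_DAY - 6) / 24))
    for i in range(6 * STEPS_PER_DAY)
]

raw = list(truth)
raw[2] = 50.0  # inject one night-time spike
cleaned = despike(raw)

# Hide days 2 and 3 to imitate a long gap, fill it, then score against the truth.
gap = range(2 * STEPS_PER_DAY, 4 * STEPS_PER_DAY)
masked = [None if i in gap else v for i, v in enumerate(cleaned)]
filled = fill_mean_diurnal(masked, STEPS_PER_DAY)

gap_truth = [truth[i] for i in gap]
gap_rmse = rmse([filled[i] for i in gap], gap_truth)
gap_nrmse_percent = 100 * gap_rmse / (max(gap_truth) - min(gap_truth))
print(f"RMSE = {gap_rmse:.3f}, RMSE as % of observed range = {gap_nrmse_percent:.2f}%")
```

Masking a stretch of known data and scoring the fill against it is a common way to evaluate gap-filling. A learned model driven by environmental covariates would replace `fill_mean_diurnal` in practice, while the scoring step would stay the same.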

Authors

  • Jiaxin Wang
    Faculty of Engineering, University of Toyama, Toyama-Shi 930-8555, Japan.
  • Heidi J Renninger
    Department of Forestry, Mississippi State University, Mississippi State, MS, 39762, USA.