Predicting surface soil pH spatial distribution based on three machine learning methods: a case study of Heilongjiang Province.

Journal: Environmental monitoring and assessment
PMID:

Abstract

Comprehensive and accurate acquisition of surface soil pH spatial distribution information is essential for monitoring soil degradation and providing scientific guidance for agricultural practices. This study focused on Heilongjiang Province in China, utilizing data from 125 soil survey sampling points. Key environmental covariates were identified as modeling inputs through Pearson correlation analysis and recursive feature elimination (RFE). Three machine learning models were employed to predict surface soil pH in the study area: support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost). The modeling outcomes of these models and the differences among them were then compared. The results showed that the mean monthly temperature maximum (MMTmax), mean monthly precipitation minimum (MMPmin), mean annual precipitation (MAP), drought index (DI), and mean monthly wind speed maximum (MMWSmax) were the most important environmental covariates for modeling. Climate variables are better suited to reflect the nonlinear relationships between soil properties and the environment in large and flat areas during mapping. Among the mapping models, XGBoost exhibited the highest prediction performance (R² = 0.705, RMSE = 0.633, MAE = 0.484), followed by RF (R² = 0.688, RMSE = 0.656, MAE = 0.497), while SVM was considered unstable in this study. In the uncertainty maps, XGBoost showed lower uncertainty primarily in high-altitude mountainous forest regions, whereas RF achieved higher prediction consistency mainly in low-altitude plain areas. Each prediction model had its advantages in different terrain regions, yet XGBoost was regarded as the optimal model. According to the optimal model, the typical black soil in Heilongjiang Province was generally weakly acidic, with an average pH of 6.42, and pH increased gradually from east to west and from north to south. Soil acidification mainly occurred in the meadow black soil and albic black soil regions in the eastern and northeastern parts of Heilongjiang Province. It is imperative to rigorously control the application of nitrogen fertilizers and to focus on improving the soil's acid-base buffering capacity.
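
The three accuracy metrics quoted for each model can be computed from paired observed and predicted pH values. The following is a minimal sketch, assuming the first reported metric is the usual coefficient of determination (1 minus the ratio of residual to total sum of squares); the function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical pH values for illustration only (not data from the study).
observed = [5.8, 6.1, 6.4, 6.9, 7.3]
predicted = [6.0, 6.0, 6.5, 6.7, 7.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

RMSE and MAE are in pH units, so they can be read directly against the reported average pH, whereas the coefficient of determination is unitless.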

Authors

  • Pu Huang
    Department of Obstetrics & Gynecology, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China.
  • Qing Huang
    Department of Environmental Health and Occupational Medicine, West China School of Public Health, Sichuan University, Chengdu 610041, China.
  • Jingtian Wang
    National Key Laboratory of Efficient Utilization of Arid and Semi-arid Arable Land in Northern China, Beijing, 100081, China.
  • Yuhan Shi
    Electrical and Computer Engineering Department, University of California San Diego, La Jolla, CA, USA.