Predicting nepetalactone accumulation in Nepeta persica using machine learning algorithms and geospatial analysis.
Journal: Scientific Reports
Published Date: Aug 27, 2025
Abstract
Nepeta persica is a medicinal plant with significant pharmacological potential, primarily attributed to its high nepetalactone content. Understanding the environmental drivers of nepetalactone biosynthesis is essential for optimizing both cultivation and conservation strategies. In this study, we assessed the influence of climatic, topographic, and edaphic factors on nepetalactone concentration in N. persica across Fars province, Iran. To do so, we combined three machine learning algorithms (random forest [RF], support vector machines [SVM], and gradient boosting machines [GBM]) and a hybrid ensemble model (RF-SVM-GBM) with statistical approaches (generalized linear models [GLM] and partial least squares [PLS]) and geospatial analyses (GIS, remote sensing, and habitat suitability modeling). The results identified elevation, south-facing slopes, and mean annual temperature as the most critical determinants of nepetalactone accumulation. The hybrid ensemble model achieved the highest predictive accuracy (RMSE = 0.015), a 21.1% reduction in RMSE relative to the individual models. Habitat suitability maps revealed Marvdasht and Shiraz counties as the most favorable regions for cultivating N. persica with high nepetalactone concentrations, followed by smaller high-suitability zones in northeastern Firozabad and northern Kazerun. In contrast, Abadeh, Eqlid, and Khorrambid exhibited lower suitability. These findings provide actionable insights for precision agriculture, resource-efficient cultivation, and climate-adaptive conservation of medicinal plants. By integrating ecological modeling with machine learning, this research offers a scalable, data-driven framework to support the sustainable production of high-value secondary metabolites in environmentally challenging regions.
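To make the reported accuracy figures concrete, the following is a minimal sketch of how an ensemble's RMSE can be compared with that of individual models. It is an illustration under stated assumptions, not the study's implementation: the abstract does not say how the RF, SVM, and GBM predictions were combined, so the unweighted average used here is an assumption, and all function names and toy values are hypothetical. The last helper only back-calculates the baseline RMSE implied by the reported figures, assuming the 21.1% reduction is measured against a single baseline value.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def average_ensemble(*prediction_sets):
    """Unweighted mean of per-model predictions.

    Assumption: simple averaging stands in for the RF-SVM-GBM hybrid,
    whose actual combination rule is not given in the abstract.
    """
    return [sum(preds) / len(preds) for preds in zip(*prediction_sets)]


def relative_reduction(baseline_rmse, ensemble_rmse):
    """Fractional RMSE reduction of the ensemble versus a baseline."""
    return (baseline_rmse - ensemble_rmse) / baseline_rmse


def implied_baseline(ensemble_rmse, reduction):
    """Baseline RMSE implied by an ensemble RMSE and a fractional reduction."""
    return ensemble_rmse / (1.0 - reduction)


# Synthetic toy values for illustration only, NOT data from the study.
y_true = [0.50, 0.62, 0.71, 0.80]
rf_pred = [0.53, 0.60, 0.74, 0.78]
svm_pred = [0.48, 0.65, 0.69, 0.83]
gbm_pred = [0.51, 0.61, 0.70, 0.79]

ensemble_pred = average_ensemble(rf_pred, svm_pred, gbm_pred)
individual = {
    "RF": rmse(y_true, rf_pred),
    "SVM": rmse(y_true, svm_pred),
    "GBM": rmse(y_true, gbm_pred),
}
ensemble_rmse = rmse(y_true, ensemble_pred)
best_single = min(individual.values())

print("individual RMSEs:", individual)
print("ensemble RMSE:", ensemble_rmse)
print("reduction vs best single model:",
      relative_reduction(best_single, ensemble_rmse))

# Back-calculation from the figures reported in the abstract.
print("implied baseline RMSE:", implied_baseline(0.015, 0.211))
```

In this toy case, averaging helps because the three models' errors partly cancel; an ensemble is not guaranteed to beat its best member when the errors are strongly correlated.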