Data-driven and machine learning framework for Alfalfa yield response to long-term climate variability in the Kansas High Plains (USA).

Journal: Scientific Reports
Published Date:

Abstract

Climate extremes and declining water availability in the U.S. High Plains threaten the long-term sustainability of alfalfa production. This study presents a machine learning (ML)-based modeling framework to identify and evaluate key agro-climatic predictors of alfalfa (Medicago sativa L.) yield across nine agricultural districts in Kansas from 1981 to 2018. Its novel contribution is the integration of a high-dimensional climate predictor space, long-term historical records, and district-level stratification of irrigated and rainfed systems to systematically assess spatially varying climate-yield relationships. Using 117 climate-derived variables from PRISM and yield data from USDA-NASS, a Minimum Redundancy Maximum Relevance (mRMR) algorithm was applied to rank predictors, followed by forward feature selection with three non-parametric regression models. Four spatial configurations were evaluated: statewide, district-specific, irrigated, and rainfed datasets. The Kansas High Plains region, with its strong west-to-east precipitation gradient, groundwater-dependent irrigation systems, and recurrent drought exposure, offers a globally relevant testbed for understanding crop-climate interactions in semi-arid environments. During feature selection, model RMSE ranged from 0.52 to 1.14 tons/acre. August dew point temperature and seasonal vapor pressure deficit (VPD) metrics emerged as dominant predictors, indicating that these variables are strongly associated with late-summer yield variability. Rainfed districts prioritized drought indicators (e.g., no-precipitation days, VPD), while irrigated districts showed stronger associations with thermal and humidity metrics. By explicitly accounting for climate sensitivities across management regimes and spatial scales, this framework provides a transferable approach for identifying robust, region-specific climate drivers in high-dimensional agricultural datasets.
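The two-stage selection described in the abstract (mRMR ranking, then forward feature selection with a non-parametric regressor) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not say which mRMR variant or which three regressors were used, so the sketch uses a correlation-based mRMR score (relevance minus mean redundancy), a k-nearest-neighbour regressor, and leave-one-out RMSE as the selection criterion. The data are synthetic, and the column names (`vpd_jja`, `tdew_aug`, and so on) are hypothetical stand-ins, not the PRISM-derived variables.

```python
import math
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def mrmr_rank(columns, y):
    """Greedy mRMR ranking: |corr with target| minus mean |corr with already-ranked|."""
    relevance = {name: abs(pearson(vals, y)) for name, vals in columns.items()}
    ranked, remaining = [], list(columns)
    while remaining:
        def score(name):
            if not ranked:
                return relevance[name]
            redundancy = mean(abs(pearson(columns[name], columns[s])) for s in ranked)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def loo_rmse_knn(columns, names, y, k=3):
    """Leave-one-out RMSE of a k-nearest-neighbour regressor on the named features."""
    rows = list(zip(*(columns[n] for n in names)))
    sq_err = 0.0
    for i, row in enumerate(rows):
        # Distance from the held-out row to every other row, paired with its target.
        dists = sorted(
            (math.dist(row, other), y[j]) for j, other in enumerate(rows) if j != i
        )
        pred = mean(target for _, target in dists[:k])
        sq_err += (pred - y[i]) ** 2
    return math.sqrt(sq_err / len(rows))


def forward_select(columns, ranked, y, k=3):
    """Walk the mRMR ranking and keep a feature only if it lowers the LOO RMSE."""
    selected, best = [], float("inf")
    for name in ranked:
        rmse = loo_rmse_knn(columns, selected + [name], y, k)
        if rmse < best:
            selected.append(name)
            best = rmse
    return selected, best


# Synthetic stand-in data (hypothetical names and coefficients, not from the study).
random.seed(0)
n = 60
vpd = [random.gauss(0, 1) for _ in range(n)]
columns = {
    "vpd_jja": vpd,
    "vpd_jja_copy": [v + random.gauss(0, 0.05) for v in vpd],  # near-duplicate
    "tdew_aug": [random.gauss(0, 1) for _ in range(n)],
    "noise": [random.gauss(0, 1) for _ in range(n)],
}
y = [
    3.0 - 0.8 * columns["vpd_jja"][i] + 0.5 * columns["tdew_aug"][i] + random.gauss(0, 0.2)
    for i in range(n)
]

ranked = mrmr_rank(columns, y)
selected, best_rmse = forward_select(columns, ranked, y)
print("mRMR ranking:", ranked)
print("selected:", selected, "LOO RMSE:", round(best_rmse, 3))
```

The redundancy penalty is what separates mRMR from a plain relevance ranking: a near-duplicate of an already-ranked predictor is pushed down even when it correlates strongly with yield, which matters in a feature space where many monthly and seasonal variables are derived from the same underlying climate fields. Running the same two functions on a subset of rows (for example, only rainfed or only one district) would mirror the stratified configurations the abstract describes.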

Authors
