AdaptiveGS: an explainable genomic selection framework based on adaptive stacking ensemble machine learning.
Journal:
TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik
Published Date:
Aug 7, 2025
Abstract
We developed an adaptive and unified stacking genomic selection framework and designed a model interpretation strategy to identify candidate significant SNPs for target traits. Genomic selection (GS) is an important technique in modern molecular breeding. Stacking ensemble learning (SEL), a powerful machine learning (ML) approach to GS, combines multiple base models (base learners, BLs) and blends the strengths of different models to capture the complex relationships between phenotypes and genotypes. However, the key step of SEL, the selection of BLs, currently lacks an effective and unified framework. For the first time, we developed an adaptive, explainable, and data-driven BL selection strategy, adaptiveGS, which pre-screens the optimal BLs for the stacking GS framework to improve prediction accuracy. The selection is based on the PR index, which combines the Pearson correlation coefficient (PCC) and the normalized root mean square error (NRMSE); the top 3 of 7 ML models (or a user-defined setting) by PR index are chosen as BLs. We compared adaptiveGS with 13 other GS algorithms on a total of 21 traits (datasets) from 4 species. The results showed that adaptiveGS outperformed the 13 models on most of the 21 traits, with an average prediction accuracy (PCC) of 0.703, an average improvement of 14.4%, demonstrating superior predictive accuracy and robustness. Furthermore, the SHapley Additive exPlanations (SHAP) technique was used to interpret adaptiveGS and to identify significant SNPs for trait variation as well as potential interaction effects between SNPs. adaptiveGS offers stacking GS users a practical and unified solution for improving prediction accuracy in breeding. The adaptiveGS package is available at https://github.com/yangzhen0117/adaptiveGS.
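The BL pre-screening step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula of the PR index, so the sketch assumes a hypothetical form, PCC divided by NRMSE, and assumes that NRMSE is the RMSE normalized by the observed phenotype range. The function names (`pearson`, `nrmse`, `pr_index`, `select_base_learners`) and the toy data are illustrative and are not taken from the adaptiveGS package.

```python
import math


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((a - mean_t) * (b - mean_p) for a, b in zip(y_true, y_pred))
    var_t = sum((a - mean_t) ** 2 for a in y_true)
    var_p = sum((b - mean_p) ** 2 for b in y_pred)
    if var_t == 0 or var_p == 0:
        return 0.0  # correlation is undefined for constant vectors; treat as 0
    return cov / math.sqrt(var_t * var_p)


def nrmse(y_true, y_pred):
    """RMSE normalized by the observed range (ASSUMPTION: the abstract does
    not say which normalization adaptiveGS uses; y_true must not be constant)."""
    n = len(y_true)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)
    return rmse / (max(y_true) - min(y_true))


def pr_index(y_true, y_pred):
    """HYPOTHETICAL PR index: higher PCC and lower NRMSE give a higher score.
    The real definition may differ; this is only one plausible combination."""
    return pearson(y_true, y_pred) / (nrmse(y_true, y_pred) + 1e-12)


def select_base_learners(y_true, predictions, top_k=3):
    """Rank candidate models by PR index and keep the top_k as base learners.

    predictions maps a model name to its (ideally cross-validated) predictions.
    """
    scores = {name: pr_index(y_true, pred) for name, pred in predictions.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Toy phenotypes and predictions from three made-up candidate models.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "model_a": [1.1, 2.0, 2.9, 4.2, 5.1],
    "model_b": [2.0, 2.0, 3.0, 4.0, 4.0],
    "model_c": [5.0, 4.0, 3.0, 2.0, 1.0],
}
selected, pr_scores = select_base_learners(y_obs, candidates, top_k=2)
print(selected)
```

In a real stacking workflow the predictions passed to `select_base_learners` would come from cross-validation on the training set, and the selected BLs would then be fitted and combined by a meta-learner. That second stage is omitted here.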