Learnable Group Transform: Enhancing Genotype-to-Phenotype Prediction for Rice Breeding with Small, Structured Datasets
Journal: arXiv
Published Date: Mar 14, 2025
Abstract
Genotype-to-Phenotype (G2P) prediction plays a pivotal role in crop breeding,
enabling the identification of superior genotypes based on genomic data. Rice
(Oryza sativa), one of the most important staple crops, faces challenges in
improving yield and resilience due to the complex genetic architecture of
agronomic traits and the limited sample size in breeding datasets. Current G2P
prediction methods, such as GWAS and linear models, often fail to capture
complex non-linear relationships between genotypes and phenotypes, leading to
suboptimal prediction accuracy. Additionally, population stratification and
overfitting are significant obstacles when models are applied to small datasets
with diverse genetic backgrounds. This study introduces the Learnable Group
Transform (LGT) method, which aims to overcome these challenges by combining
the advantages of traditional linear models with advanced machine learning
techniques. LGT utilizes a group-based transformation of genotype data to
capture spatial relationships and genetic structures across diverse rice
populations, offering flexibility to generalize even with limited data. Through
extensive experiments on the Rice529 dataset, a panel of 529 rice accessions,
LGT demonstrated substantial improvements in prediction accuracy for multiple
agronomic traits, including yield and plant height, compared to
state-of-the-art baselines such as linear models and recent deep learning
approaches. Notably, LGT achieved an R^2 improvement of up to 15% for yield
prediction, significantly reducing error and demonstrating its ability to
extract meaningful signals from high-dimensional, noisy genomic data. These
results highlight the potential of LGT as a powerful tool for genomic
prediction in rice breeding, offering a promising solution for accelerating the
identification of high-yielding and resilient rice varieties.
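
The abstract reports its headline result as an R^2 improvement of up to 15% but does not say whether that figure is an absolute difference in R^2 or a gain relative to the baseline. The sketch below is not taken from the paper. It shows how R^2 is conventionally computed and how the two readings differ. The function names `r_squared` and `r2_gain` and all the numbers are hypothetical, chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def r2_gain(r2_baseline, r2_model):
    """Return (absolute gain, relative gain) of a model over a baseline."""
    if r2_baseline <= 0:
        raise ValueError("relative gain needs a positive baseline R^2")
    absolute = r2_model - r2_baseline
    relative = absolute / r2_baseline
    return absolute, relative


# Hypothetical phenotype values and predictions (not from Rice529).
y = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [2.0, 2.0, 3.0, 3.0]  # stand-in for a linear baseline
model_pred = [1.0, 2.0, 3.0, 3.0]     # stand-in for an improved model

r2_base = r_squared(y, baseline_pred)
r2_new = r_squared(y, model_pred)
abs_gain, rel_gain = r2_gain(r2_base, r2_new)
print(f"baseline R^2={r2_base:.2f}, model R^2={r2_new:.2f}")
print(f"absolute gain={abs_gain:.2f}, relative gain={rel_gain:.1%}")
```

With these toy numbers the same pair of models gives an absolute gain of 0.20 and a relative gain of about 33%, so the two readings of a figure like "15%" can differ substantially.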