Biological Prior Knowledge-Embedded Deep Neural Network for Plant Genomic Prediction.
Journal:
Genes
PMID:
40282370
Abstract
Genomic prediction is a powerful approach that predicts phenotypic traits from genotypic information, accelerating trait improvement in plant breeding. Traditional genomic prediction methods have relied primarily on linear mixed models, such as Genomic Best Linear Unbiased Prediction (GBLUP), and conventional machine learning methods such as Support Vector Regression (SVR). These methods are limited in handling high-dimensional data and nonlinear relationships, so deep learning methods have also been applied to genomic prediction in recent years. We propose iADEP, an Integrated Additive, Dominant, and Epistatic Prediction model based on deep learning. Specifically, single nucleotide polymorphism (SNP) data integrating latent genetic interactions and genome-wide association study results as biological prior knowledge are fused into an SNP embedding block, which is then fed to a local encoder. The local encoder is fused with an omics-data-incorporated global decoder through a multi-head attention mechanism, followed by multilayer perceptrons. First, we demonstrated through experiments on four datasets that iADEP outperforms existing methods in genotype-to-phenotype prediction. Second, we validated the effectiveness of SNP embedding through ablation experiments. Third, we provided a module for combining other omics data in iADEP and proposed a novel method for fusing them. Fourth, we explored the impact of feature selection on iADEP performance and concluded that using the full set of SNPs generally provides optimal results. Finally, by altering the partition of training and testing sets, we investigated the differences between transductive learning and inductive learning. iADEP provides a new approach for AI breeding, a promising method that integrates biological prior knowledge and enables combination with other omics data.
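To make the "additive, dominant, and epistatic" terminology concrete, the following minimal sketch shows one common textbook way to code these three effect types from SNP genotypes. It is an illustration only, not the iADEP SNP embedding block: the function name `encode_snps`, the 0/1/2 minor-allele-count input, and the use of pairwise products as epistatic features are all assumptions made for this example.

```python
from itertools import combinations


def encode_snps(genotypes):
    """Hypothetical helper: code one individual's SNPs three ways.

    genotypes: list of minor-allele counts (0, 1, or 2), an assumed input format.
    """
    # Additive coding: the effect scales with the number of minor alleles.
    additive = [float(g) for g in genotypes]
    # Dominance coding: flags heterozygotes (exactly one minor allele).
    dominance = [1.0 if g == 1 else 0.0 for g in genotypes]
    # Pairwise epistatic coding (assumption): product of additive codes
    # for every unordered pair of SNPs.
    epistatic = [
        additive[i] * additive[j]
        for i, j in combinations(range(len(genotypes)), 2)
    ]
    return additive, dominance, epistatic


additive, dominance, epistatic = encode_snps([0, 1, 2])
print(additive)   # [0.0, 1.0, 2.0]
print(dominance)  # [0.0, 1.0, 0.0]
print(epistatic)  # [0.0, 0.0, 2.0]
```

The number of pairwise terms grows quadratically with the number of SNPs, which is one reason explicit enumeration like this is impractical at genome scale and learned representations are used instead.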