Leveraging weighted embedding and Transformer architecture to improve phenotype prediction of complex traits for crops.

Journal: Nature Communications
Published Date:

Abstract

Understanding the relationship between genomic variation and phenotype is fundamental to deciphering the genetic architecture underlying complex traits. Yet existing statistical models struggle to handle massive genomic datasets while remaining biologically interpretable. Here, we introduce GP-WAITER, a deep learning framework that integrates GWAS-derived SNP weights into a hybrid convolutional neural network and Transformer architecture. By combining a weighted embedding mechanism with multi-head self-attention, GP-WAITER effectively captures long-range dependencies across ultra-long genomic sequences. The model consistently outperforms seven state-of-the-art genomic prediction models across six datasets, achieving up to a 77.5% improvement in prediction accuracy, a 78% reduction in mean squared error, and a 1.8- to 2.4-fold increase in computational efficiency. Furthermore, GP-WAITER offers biological transparency by pinpointing key genetic variants driving specific traits. This scalable, interpretable framework provides a powerful tool for precision breeding and the functional interpretation of trait-associated variants.
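
The abstract does not spell out how the weighted embedding or the attention layers are computed, so the following is only a minimal sketch of the general idea, not the GP-WAITER implementation. It assumes genotypes coded as 0/1/2, per-SNP weights derived from hypothetical GWAS p-values as normalized -log10(p), a random embedding table, and a single attention head with identity projections in place of the learned multi-head layers. All names (`weighted_embedding`, `self_attention`, `p_values`, `table`) are illustrative.

```python
import numpy as np

def weighted_embedding(genotypes, snp_weights, table):
    """Look up one embedding per genotype code and scale it by that SNP's weight."""
    # genotypes: (n_snps,) ints in {0, 1, 2}; snp_weights: (n_snps,); table: (3, d)
    return table[genotypes] * snp_weights[:, None]

def self_attention(x):
    """Single-head scaled dot-product self-attention (identity Q/K/V projections)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)         # each row sums to 1
    return attn @ x, attn

rng = np.random.default_rng(0)
n_snps, d = 6, 4

# Hypothetical inputs: genotype codes and GWAS p-values for six SNPs.
genotypes = rng.integers(0, 3, size=n_snps)
p_values = np.array([1e-8, 0.5, 1e-3, 0.9, 1e-5, 0.2])

# Assumed weighting scheme: -log10(p), rescaled so the strongest SNP has weight 1.
snp_weights = -np.log10(p_values)
snp_weights = snp_weights / snp_weights.max()

table = rng.normal(size=(3, d))           # stand-in for a learned embedding table
emb = weighted_embedding(genotypes, snp_weights, table)
out, attn = self_attention(emb)           # every SNP can attend to every other SNP
```

In this sketch, SNPs with weak GWAS signal contribute embeddings close to zero, while the attention matrix `attn` lets each position draw on any other position regardless of distance along the sequence, which is the property the abstract refers to as capturing long-range dependencies.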
