Ensembles of Graph Neural Networks Supervised by Genotype-to-Phenotype Structures Improved Genomic Prediction Performance

Journal: bioRxiv
Published Date:

Abstract

Accurate selection of favourable crop genotypes has motivated the exploration of diverse prediction algorithms for crop breeding applications. One genomic prediction method that has not been fully explored is the graph attention network (GAT). By directly analysing graph-structured data with the attention mechanism, GAT can incorporate the genotype-to-phenotype (G2P) structure to regularise predictions. Many graph structures are possible, each informed by knowledge of trait genetic architecture. A graphical representation of the inferred trait genetic architecture can serve as prior knowledge that helps GAT learn key features of that architecture and improve predictions. Here, we investigated whether incorporating prior knowledge into GAT improved performance compared to GAT models representing a continuum of G2P structures, ranging from infinitesimal to fully connected. Applying the Diversity Prediction Theorem, we also combined these diverse G2P structures into an ensemble of GAT genomic prediction models to integrate the complementary strengths of multiple models. The results for flowering time traits in two maize nested association mapping datasets showed no consistent performance improvement for the prior-knowledge GAT model. However, consistent outperformance was observed for the ensemble of GAT models. Improved predictions from the ensemble model may be driven by its ability to capture a more complete representation of the trait genetic architecture through the integration of information from diverse G2P structures, as proposed by the Diversity Prediction Theorem. Our results show that an ensemble of GAT models can enhance prediction performance.
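
The Diversity Prediction Theorem referred to above states that the squared error of an averaged ensemble equals the average squared error of its members minus the diversity (variance) of their predictions around the ensemble mean. The following is a minimal sketch of that identity in plain Python, not the authors' implementation. The function name `diversity_decomposition`, the unweighted averaging of models, and all numbers (hypothetical flowering times in days for four lines and three models) are assumptions made purely for illustration.

```python
from statistics import fmean


def diversity_decomposition(predictions, truth):
    """Return (ensemble_error, mean_model_error, diversity).

    predictions: list of per-model prediction lists (one list per model).
    truth: list of observed values, same length as each prediction list.
    All three quantities are mean squared values over observations.
    """
    n_obs = len(truth)
    # Ensemble prediction: unweighted mean across models for each observation.
    ensemble = [fmean(model[i] for model in predictions) for i in range(n_obs)]
    # Squared error of the ensemble prediction.
    ensemble_error = fmean((ensemble[i] - truth[i]) ** 2 for i in range(n_obs))
    # Average squared error of the individual models.
    mean_model_error = fmean(
        fmean((model[i] - truth[i]) ** 2 for i in range(n_obs))
        for model in predictions
    )
    # Diversity: average squared deviation of each model from the ensemble.
    diversity = fmean(
        fmean((model[i] - ensemble[i]) ** 2 for i in range(n_obs))
        for model in predictions
    )
    return ensemble_error, mean_model_error, diversity


# Hypothetical values, for illustration only (not data from the study).
truth = [70.0, 72.0, 68.0, 75.0]
predictions = [
    [71.0, 70.5, 69.0, 73.0],  # e.g. a model built on one G2P structure
    [68.5, 73.0, 66.0, 76.5],  # a second, differently structured model
    [70.5, 72.5, 70.0, 74.0],  # a third model
]

ens_err, avg_err, div = diversity_decomposition(predictions, truth)
print(f"ensemble error     = {ens_err:.4f}")
print(f"mean model error   = {avg_err:.4f}")
print(f"diversity          = {div:.4f}")
print(f"mean error - diversity = {avg_err - div:.4f}")
```

Because diversity is never negative, the ensemble error under this identity cannot exceed the average error of its members, and it falls further below that average as the member models disagree more with one another. This is the sense in which combining models built on diverse G2P structures may help.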

Authors

  • Shunichiro Tomura; Owen Powell; Melanie J. Wilkinson; Mark Cooper