Multimodal deep learning methods enhance genomic prediction of wheat breeding.

Journal: G3 (Bethesda, Md.)
PMID:

Abstract

While several statistical machine learning methods have been developed and studied for assessing the genomic prediction (GP) accuracy of unobserved phenotypes in plant breeding research, few methods have linked genomics and phenomics (imaging). Deep learning (DL) neural networks have been developed to increase the GP accuracy of unobserved phenotypes while simultaneously accounting for the complexity of genotype-environment interaction (GE); however, unlike conventional GP models, DL has not been investigated in settings where genomics is linked with phenomics. In this study we used 2 wheat data sets (DS1 and DS2) to compare a novel DL method with conventional GP models. The models fitted for DS1 were GBLUP, gradient boosting machine (GBM), support vector regression (SVR), and the DL method. For 1 year, DL provided better GP accuracy than the other models; for the other years, the GBLUP model was slightly superior to DL. DS2 comprises only genomic data from wheat lines tested for 3 years, in 2 environments (drought and irrigated), and for 2-4 traits. For DS2, when the irrigated environment was predicted from the drought environment, DL had higher accuracy than the GBLUP model for all analyzed traits and years. When the drought environment was predicted from the irrigated environment, the DL and GBLUP models had similar accuracy. The DL method used in this study is novel and offers a high degree of generality, as several modules can potentially be incorporated and concatenated to produce an output for a multi-input data structure.
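The idea of separate modules whose outputs are concatenated into one prediction can be illustrated with a minimal sketch. This is not the authors' architecture: the class `MultiInputNet`, the helper functions, the layer sizes, and the toy inputs are all hypothetical, the weights are random and untrained, and only the Python standard library is used. It shows one genomic branch and one phenomic (image-feature) branch feeding a shared output layer.

```python
import random


def dense(inputs, weights, biases, relu=True):
    """Apply one fully connected layer to a single input vector."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(max(0.0, z) if relu else z)
    return outputs


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative, untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


class MultiInputNet:
    """Hypothetical two-branch network: one module per data modality."""

    def __init__(self, n_markers, n_image_features, hidden=4, seed=0):
        rng = random.Random(seed)
        self.genomic = init_layer(n_markers, hidden, rng)           # marker branch
        self.phenomic = init_layer(n_image_features, hidden, rng)   # imaging branch
        self.head = init_layer(2 * hidden, 1, rng)                  # shared output layer

    def predict(self, markers, image_features):
        g = dense(markers, *self.genomic)
        p = dense(image_features, *self.phenomic)
        merged = g + p  # list concatenation joins the two branch outputs
        return dense(merged, *self.head, relu=False)[0]


# Toy line: 5 marker codes (0/1/2) and 3 image-derived features (made-up values).
net = MultiInputNet(n_markers=5, n_image_features=3)
y_hat = net.predict([0, 1, 2, 1, 0], [0.2, 0.5, 0.1])
```

Under this layout, a further modality (for example, environmental covariates) would be one more branch whose output is appended to `merged`, with the input size of the output layer widened to match.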

Authors

  • Abelardo Montesinos-López
    Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430, Guadalajara, Jalisco, México.
  • Carolina Rivera
    International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico.
  • Francisco Pinto
    International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico.
  • Francisco Piñera
    International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico.
  • David González
    Aragon Institute of Engineering Research, Universidad de Zaragoza, Zaragoza, Spain.
  • Mathew Reynolds
    International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico.
  • Paulino Pérez-Rodríguez
    Colegio de Postgraduados, Campus Montecillo, Texcoco, México, 056230, México. perpdgo@gmail.com.
  • Huihui Li
    School of Computer Science and Engineering, South China University of Technology, Guangzhou 510000, China. Electronic address: 29777562@qq.com.
  • Osval A Montesinos-López
    Facultad de Telemática. oamontes1@ucol.mx, j.crossa@cgiar.org.
  • José Crossa
    Biometrics and Statistics Unit (BSU), International Maize and Wheat Improvement Center (CIMMYT), Apdo Postal 6-641, México DF, 06600 24105, México. j.crossa@cgiar.org.