Model-to-crop conserved NUE Regulons enhance machine learning predictions of nitrogen use efficiency.

Journal: The Plant Cell
Published Date:

Abstract

Systems biology aims to uncover gene regulatory networks (GRNs) for agricultural traits, but validating them in crops is challenging. We addressed this challenge by learning and validating model-to-crop transcription factor (TF) regulons governing nitrogen use efficiency (NUE). First, a fine-scale time-course nitrogen (N) response transcriptome analysis revealed a conserved temporal N response cascade in maize (Zea mays) and Arabidopsis (Arabidopsis thaliana). These data were used to infer time-based causal TF-target edges in N-regulated GRNs. By validating 23 maize TFs in a cell-based TF-perturbation assay (TARGET: Transient Assay Reporting Genome-wide Effects of Transcription factors), precision/recall analysis enabled us to prune the network to high-confidence edges between ∼200 TFs and 700 maize target genes. We next learned gene-to-NUE trait scores using XGBoost machine learning models trained on conserved N-responsive genes across maize and Arabidopsis accessions. By integrating NUE gene scores within our N-GRN, we ranked maize TFs based on a cumulative NUE Regulon score. NUE Regulons for top-ranked TFs were validated using the cell-based TARGET assay in maize (e.g. ZmMYB34/R3→24 targets) and for the Arabidopsis ZmMYB34/R3 ortholog (e.g. AtDIV1→23 targets). The genes in this NUE Regulon significantly enhanced the ability of XGBoost models to predict NUE traits in both maize and Arabidopsis. Thus, our pipeline for identifying TF regulons, which combines GRN inference, machine learning, and orthologous network regulons, offers a strategic framework for crop trait improvement.
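
The ranking step described above (integrating per-gene NUE scores within the pruned GRN to obtain a cumulative score per TF) can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so this assumes the simplest reading: a TF's regulon score is the sum of the NUE scores of its distinct target genes. The function name, the TF and gene labels, and the score values are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def rank_tfs_by_regulon_score(edges, gene_scores):
    """Rank TFs by the summed score of their distinct target genes.

    edges       -- iterable of (tf, target) pairs from a pruned GRN (hypothetical input)
    gene_scores -- dict mapping gene -> NUE importance score (hypothetical input)

    Assumption: "cumulative" is taken to mean a plain sum over targets;
    the published method may weight or normalize differently.
    """
    regulons = defaultdict(set)
    for tf, target in edges:
        regulons[tf].add(target)  # a set ignores duplicated edges

    totals = {
        tf: sum(gene_scores.get(gene, 0.0) for gene in targets)
        for tf, targets in regulons.items()
    }
    # Highest score first; ties broken alphabetically by TF name.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Toy, made-up network and scores, for illustration only.
edges = [
    ("TF_A", "g1"),
    ("TF_A", "g2"),
    ("TF_B", "g2"),
    ("TF_B", "g3"),
    ("TF_A", "g1"),  # duplicate edge, counted once
]
gene_scores = {"g1": 0.5, "g2": 0.25, "g3": 0.125}

ranking = rank_tfs_by_regulon_score(edges, gene_scores)
for tf, score in ranking:
    print(tf, score)
```

In this toy example the gene scores stand in for feature importances from a trained model. A plain sum favors TFs with many targets, so a real analysis might also compare against a size-matched random expectation.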

Authors

  • Ji Huang
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
  • Chia-Yi Cheng
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
  • Matthew D Brooks
    Global Change and Photosynthesis Research Unit, USDA-ARS, Urbana, IL 61801, USA.
  • Tim L Jeffers
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
  • Nathan M Doner
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
  • Hung-Jui Shih
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
  • Samantha Frangos
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
  • Manpreet Singh Katari
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
  • Gloria M Coruzzi
    Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA. gloria.coruzzi@nyu.edu