Predictive Models of Genetic Redundancy in Arabidopsis thaliana.

Journal: Molecular Biology and Evolution
Published Date:

Abstract

Genetic redundancy refers to a situation where an individual with a loss-of-function mutation in one gene (single mutant) does not show an apparent phenotype until one or more paralogs are also knocked out (double/higher-order mutant). Previous studies have identified some characteristics common among redundant gene pairs, but a predictive model of genetic redundancy that incorporates a wide variety of features derived from accumulating omics and mutant phenotype data has yet to be established. In addition, the relative importance of these features for genetic redundancy remains largely unclear. Here, we establish machine learning models for predicting whether a gene pair is likely redundant or not in the model plant Arabidopsis thaliana based on six feature categories: functional annotations, evolutionary conservation including duplication patterns and mechanisms, epigenetic marks, protein properties including posttranslational modifications, gene expression, and gene network properties. The definition of redundancy, the data transformations, the feature subsets, and the machine learning algorithms used significantly affected model performance, as evaluated on held-out phenotype test data. Among the most important features in predicting gene pairs as redundant were having one or more paralogs from recent duplication events, annotation as a transcription factor, downregulation during stress conditions, and having similar expression patterns under stress conditions. We also explored the potential reasons underlying mispredictions and the limitations of our study. This genetic redundancy model sheds light on characteristics that may contribute to the long-term maintenance of paralogs and will ultimately allow for more targeted generation of functionally informative double mutants, advancing functional genomic studies.
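To make the modeling setup concrete, the following is a minimal sketch of a binary classifier for gene pairs, evaluated on a held-out set and followed by a rough feature ranking. It is not the authors' pipeline. The feature names, the synthetic data, the generating rule, and the choice of a plain logistic regression are all assumptions made for illustration, and ranking features by absolute weight is only a crude proxy for feature importance.

```python
import math
import random

# Hypothetical feature names loosely inspired by the abstract; NOT the authors' feature set.
FEATURES = ["recent_duplicate", "is_transcription_factor", "stress_expr_correlation"]


def make_synthetic_pairs(n, seed=0):
    """Generate toy gene pairs with made-up features and labels (1 = redundant)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [float(rng.random() < 0.5), float(rng.random() < 0.3), rng.random()]
        # Arbitrary generating rule plus noise; purely illustrative, not biological data.
        score = 2.0 * x[0] + 1.0 * x[1] + 2.0 * x[2] - 2.5 + rng.gauss(0, 0.5)
        rows.append((x, 1 if score > 0 else 0))
    return rows


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, epochs=500, lr=0.5):
    """Fit logistic regression weights by full-batch gradient descent."""
    w = [0.0] * len(FEATURES)
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in rows:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (redundant) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


pairs = make_synthetic_pairs(500)
train, holdout = pairs[:400], pairs[400:]  # the holdout set is never used for fitting
w, b = train_logistic(train)
accuracy = sum(predict(w, b, x) == y for x, y in holdout) / len(holdout)

# Rank features by absolute weight as a rough importance proxy.
ranking = sorted(zip(FEATURES, w), key=lambda fw: abs(fw[1]), reverse=True)

print(f"holdout accuracy: {accuracy:.2f}")
for name, weight in ranking:
    print(f"{name}: {weight:+.2f}")
```

Keeping the held-out pairs out of the fitting step is what makes the reported accuracy an estimate of performance on unseen gene pairs rather than a measure of how well the model memorized its training data.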

Authors

  • Siobhan A Cusack
    Cell and Molecular Biology Program, Michigan State University, East Lansing, MI, USA.
  • Peipei Wang
    Department of Plant Biology, Michigan State University, East Lansing, MI, USA.
  • Serena G Lotreck
    Department of Plant Biology, Michigan State University, East Lansing, MI, USA.
  • Bethany M Moore
    Department of Botany, University of Wisconsin-Madison, Madison, WI, USA.
  • Fanrui Meng
    Department of Cellular and Physiological Sciences, LSI Imaging, Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
  • Jeffrey K Conner
    Department of Plant Biology, Michigan State University, East Lansing, MI, USA.
  • Patrick J Krysan
    Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA.
  • Melissa D Lehti-Shiu
    Department of Plant Biology, Michigan State University, East Lansing, MI, USA.
  • Shin-Han Shiu
    Department of Plant Biology; gustavoc@msu.edu, shius@msu.edu.