Joint representation of molecular networks from multiple species improves gene classification.

Journal: PLoS computational biology
Published Date:

Abstract

Network-based machine learning (ML) has the potential to predict novel genes associated with nearly any health or disease context. However, this approach often uses network information from only the single species under consideration, even though networks for most species are noisy and incomplete. While some recent methods have begun addressing this shortcoming by using networks from more than one species, they lack one or more key desirable properties: handling networks from more than two species simultaneously, incorporating many-to-many orthology information, or generating a network representation that is reusable across different types of prediction tasks, including newly defined ones. Here, we present GenePlexusZoo, a framework that casts molecular networks from multiple species into a single reusable feature space for network-based ML. We demonstrate that this multi-species network representation improves both gene classification within a single species and knowledge transfer across species, even in cases where the inter-species correspondence is undetectable based on shared orthologous genes. Thus, GenePlexusZoo makes it possible to effectively leverage the high degree of evolutionary conservation at the molecular, functional, and phenotypic levels across species to discover novel genes associated with diverse biological contexts.

Authors

  • Christopher A Mancuso
    Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA.
  • Kayla A Johnson
    Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America.
  • Renming Liu
    Department of Computational Mathematics, Science & Engineering, Michigan State University, East Lansing, MI 48824, USA.
  • Arjun Krishnan
    Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA; Departments of Computational Mathematics, Science, and Engineering and Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA.