NMFGO: Gene Function Prediction via Nonnegative Matrix Factorization with Gene Ontology.

Journal: IEEE/ACM transactions on computational biology and bioinformatics
Published Date:

Abstract

Gene Ontology (GO) is a controlled vocabulary of terms that describe the molecular functions, biological roles, and cellular locations of gene products (i.e., proteins and RNAs). It hierarchically organizes more than 43,000 GO terms in a directed acyclic graph. A gene is generally annotated with several of these GO terms, so accurately predicting the associations between genes and such a large number of terms is a difficult challenge. To address this challenge, we propose a matrix factorization based approach called NMFGO. NMFGO stores the available GO annotations of genes in a gene-term association matrix and adopts a taxonomic similarity measure based on the ontological structure to capture the GO hierarchy. Next, it factorizes the association matrix into two low-rank matrices via nonnegative matrix factorization regularized with the GO hierarchy. After that, it employs a semantic similarity based k nearest neighbor classifier in the subspace approximated by the low-rank matrices to predict gene functions. An empirical study on three model species (S. cerevisiae, H. sapiens, and A. thaliana) shows that NMFGO is robust to the input parameters and achieves significantly better prediction performance than GIC, TO, dRW-kNN, and NtN, which were re-implemented based on the instructions of the original papers. The supplementary file and demo codes of NMFGO are available at http://mlda.swu.edu.cn/codes.php?name=NMFGO.
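The pipeline outlined in the abstract (association matrix, hierarchy-regularized factorization, nearest-neighbor prediction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a standard graph-regularized NMF objective with multiplicative updates, a hand-made term-term similarity matrix S standing in for the taxonomic similarity, and cosine similarity in the latent space standing in for the semantic similarity used by NMFGO. The function names (graph_regularized_nmf, knn_scores), the parameter lam, and the toy data are all hypothetical.

```python
import numpy as np

def graph_regularized_nmf(A, S, rank, lam=0.1, n_iter=200, seed=0, eps=1e-9):
    """Factorize A (genes x terms) as U @ V.T with U, V >= 0.

    Assumed objective: ||A - U V^T||_F^2 + lam * tr(V^T (D - S) V),
    where S is a symmetric nonnegative term-term similarity matrix.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_terms = A.shape
    U = rng.random((n_genes, rank)) + eps
    V = rng.random((n_terms, rank)) + eps
    D = np.diag(S.sum(axis=1))  # degree matrix of the similarity graph
    for _ in range(n_iter):
        # Multiplicative updates keep U and V nonnegative.
        U *= (A @ V) / (U @ (V.T @ V) + eps)
        V *= (A.T @ U + lam * (S @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

def knn_scores(U, A, k=2):
    """Score terms for each gene from its k nearest genes in latent space.

    Cosine similarity is an assumption made for this sketch only.
    """
    unit = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a gene is not its own neighbor
    nbrs = np.argsort(-sim, axis=1)[:, :k]
    scores = np.zeros(A.shape, dtype=float)
    for i in range(A.shape[0]):
        w = np.clip(sim[i, nbrs[i]], 0.0, None)
        # Similarity-weighted average of the neighbors' annotations.
        scores[i] = (w @ A[nbrs[i]]) / (w.sum() + 1e-12)
    return scores

# Toy gene-term association matrix (5 genes x 4 terms), made up for illustration.
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
], dtype=float)

# Toy term-term similarity: terms 0 and 1 are related, as are terms 2 and 3.
S = np.array([
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.6, 0.0],
])

U, V = graph_regularized_nmf(A, S, rank=2)
scores = knn_scores(U, A, k=2)
print(np.round(scores, 2))
```

Each row of scores ranks candidate terms for one gene. In practice the regularization weight, the rank, and k would be tuned, and S would be derived from the GO hierarchy rather than written by hand.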

Authors

  • Guoxian Yu
    College of Computer and Information Science, Southwest University, Chongqing 400715, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China.
  • Keyao Wang
  • Guangyuan Fu
    College of Computer and Information Science, Southwest University, Chongqing 400715, China.
  • Maozu Guo
    School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China.
  • Jun Wang
    Department of Speech, Language, and Hearing Sciences and the Department of Neurology, The University of Texas at Austin, Austin, TX 78712, USA.