GOF/LOF knowledge inference with tensor decomposition in support of high order link discovery for gene, mutation and disease.

Journal: Mathematical biosciences and engineering : MBE
PMID:

Abstract

For the discovery of new uses of drugs, the function type of their target genes plays an important role, and the "Antagonist-GOF" and "Agonist-LOF" hypotheses have laid a solid foundation for drug repurposing. In this research, an active gene annotation corpus was used as training data to predict whether each human gene shows a gain-of-function (GOF), loss-of-function (LOF), or unknown character after variation events. Unlike the (entity, predicate, entity) triples of a traditional three-way tensor, a four-way and a five-way tensor, the GMFD- and GMAFD-tensors, were designed to represent higher-order links among all or a subset of these entities: genes (G), mutations (M), functions (F), diseases (D) and annotation labels (A). A tensor decomposition algorithm, CP decomposition, was applied to the higher-order tensors to unveil the correlations among entities. Meanwhile, a state-of-the-art baseline tensor decomposition algorithm, RESCAL, was applied to the three-way tensor for comparison. The results showed that CP decomposition on the higher-order tensors performed better than RESCAL on the traditional three-way tensor in recovering masked data and making predictions. In addition, the four-way tensor proved to be the best format for this problem. Finally, a case study reproducing two disease-gene-drug links (Myelodysplastic Syndromes-IL2RA-Aldesleukin and Lymphoma-IL2RA-Aldesleukin) demonstrated the feasibility of our prediction model for drug repurposing.
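To make the idea of CP decomposition on a four-way gene-mutation-function-disease tensor concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain alternating least squares (ALS) fit in NumPy; the tensor is synthetic, and the shape, rank, and function names (`khatri_rao`, `unfold`, `cp_reconstruct`, `cp_als`) are illustrative choices that do not come from the paper.

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Khatri-Rao product of a list of (I_k x R) matrices."""
    rank = mats[0].shape[1]
    out = mats[0]
    for m in mats[1:]:
        # Rows of the result enumerate (row of out, row of m), with m varying fastest.
        out = (out[:, None, :] * m[None, :, :]).reshape(-1, rank)
    return out

def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the rest (C order)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def cp_reconstruct(factors):
    """Rebuild the full tensor from its CP factor matrices."""
    shape = tuple(f.shape[0] for f in factors)
    return (factors[0] @ khatri_rao(factors[1:]).T).reshape(shape)

def cp_als(tensor, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in tensor.shape]
    for _ in range(n_iter):
        for mode in range(tensor.ndim):
            others = [f for k, f in enumerate(factors) if k != mode]
            kr = khatri_rao(others)
            # Gram matrix of the Khatri-Rao product equals the
            # element-wise product of the individual Gram matrices.
            gram = np.ones((rank, rank))
            for f in others:
                gram *= f.T @ f
            # Least-squares update of one factor with the others held fixed.
            factors[mode] = unfold(tensor, mode) @ kr @ np.linalg.pinv(gram)
    return factors

# Synthetic stand-in for a GMFD-style tensor (assumed sizes, not real data):
# 4 genes x 3 mutations x 2 function types (GOF, LOF) x 3 diseases.
rng = np.random.default_rng(42)
shape = (4, 3, 2, 3)
true_factors = [rng.random((dim, 2)) for dim in shape]
tensor = cp_reconstruct(true_factors)

factors = cp_als(tensor, rank=2)
approx = cp_reconstruct(factors)
rel_err = np.linalg.norm(approx - tensor) / np.linalg.norm(tensor)
print(f"relative reconstruction error: {rel_err:.3e}")
```

In a link-prediction setting, an entry of `approx` at index (gene, mutation, function, disease) would be read as a score for that candidate link. How masked entries are handled and scored in the paper is not specified in this abstract, so the sketch fits the full tensor only.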

Authors

  • Kai Yin Zhou
    College of Informatics, Huazhong Agricultural University, 430070, Wuhan, China.
  • Yu Xing Wang
    College of Informatics, Huazhong Agricultural University, 430070, Wuhan, China.
  • Sheng Zhang
    Department of Critical Care Medicine, Taizhou Hospital of Zhejiang Province, Wenzhou Medical University, Taizhou, China.
  • Mina Gachloo
    College of Science, Huazhong Agricultural University, 430070, Wuhan, China.
  • Jin Dong Kim
    Database Center for Life Science (DBCLS), Research Organization of Information and Systems (ROIS), Tokyo, Japan.
  • Qi Luo
    B-DAT & CICAEET, School of Information and Control, Nanjing University of Information Science and Technology, Nanjing 210044, PR China.
  • Kevin Bretonnel Cohen
    Computational Bioscience Program, University of Colorado School of Medicine, Aurora, CO, USA.
  • Jing Bo Xia
    College of Informatics, Huazhong Agricultural University, 430070, Wuhan, China.