Mimvec: a deep learning approach for analyzing the human phenome.

Journal: BMC systems biology
PMID:

Abstract

BACKGROUND: The human phenome has been widely used together with a variety of genomic data sources to infer disease genes. However, most existing methods derive phenotype similarity from biomedical databases using the traditional term frequency-inverse document frequency (TF-IDF) formulation. This framework, though intuitive, ignores semantic relationships between words and tends to produce high-dimensional vectors, and hence cannot precisely capture the intrinsic semantic characteristics of biomedical documents. To overcome these limitations, we propose a framework called mimvec that analyzes the human phenome using a state-of-the-art deep learning technique from natural language processing.
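
The two TF-IDF limitations named above can be illustrated with a minimal sketch. It is not the mimvec method or the authors' code: the three toy phenotype descriptions, the whitespace tokenizer, the smoothed IDF variant, and the helper names `tfidf_vectors` and `cosine` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors); each vector has one entry per distinct word."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    # Document frequency: number of documents in which each word occurs.
    df = Counter(w for toks in tokenized for w in set(toks))
    # Smoothed IDF (an assumed variant) so that common words keep a nonzero weight.
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[w] / len(toks) * idf[w] for w in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical phenotype descriptions; the first two are near-synonymous
# but share no words, while the first and third share three words.
docs = [
    "renal failure with hypertension",
    "kidney insufficiency and high blood pressure",
    "renal failure with anemia",
]
vocab, vecs = tfidf_vectors(docs)
sim_synonyms = cosine(vecs[0], vecs[1])  # no shared words
sim_shared = cosine(vecs[0], vecs[2])    # shared surface words
```

In this toy case the near-synonymous pair gets zero similarity because the vectors overlap only on identical words, and each vector has as many dimensions as the vocabulary has distinct words, which grows with the size of the corpus.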

Authors

  • Mingxin Gan
    Department of Management Science and Engineering, Dongling School of Economics and Management, University of Science and Technology Beijing, Beijing, 100083, China.
  • Wenran Li
    Ministry of Education Key Laboratory of Bioinformatics; Bioinformatics Division, Department of Automation and Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, 100084, China.
  • Wanwen Zeng
    MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST, Tsinghua University, Beijing, China.
  • Xiaojian Wang
    State Key Laboratory of Cardiovascular Disease, Fu Wai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100037, China.
  • Rui Jiang
    Department of Urology, The Affiliated Hospital of Southwest Medical University, Luzhou, China.