A representation model for biological entities by fusing structured axioms with unstructured texts.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Structured semantic resources, for example, biological knowledge bases and ontologies, formally define biological concepts, entities and their semantic relationships, manifested as structured axioms and unstructured texts (e.g. textual definitions). These resources contain accurate expressions of biological reality and have been used by machine-learning models to assist intelligent applications like knowledge discovery. Current methods treat both the axioms and the definitions as plain text in representation learning (RL). However, because the axioms are machine-readable whereas the natural-language definitions are written for human readers, differences in token meaning and in structure between the two prevent the learned representations from encoding the desired biological knowledge.

Authors

  • Peiliang Lou
    Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China.
  • YuXin Dong
    School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China.
  • Antonio Jimeno Yepes
    IBM Research Australia, Melbourne, VIC, Australia. Electronic address: antonio.jimeno@au1.ibm.com.
  • Chen Li
    School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, China.