Deep learning meets ontologies: experiments to anchor the cardiovascular disease ontology in the biomedical literature.

Journal: Journal of biomedical semantics
Published Date:

Abstract

BACKGROUND: Automatically identifying term variants, that is, acceptable alternative free-text terms for gene and protein names, across millions of biomedical publications is a challenging task. Ontologies, such as the Cardiovascular Disease Ontology (CVDO), capture domain knowledge in a computational form and can provide context for gene/protein names as written in the literature. This study investigates: 1) whether word embeddings from Deep Learning algorithms can provide a list of term variants for a given gene/protein of interest; and 2) whether biological knowledge from the CVDO can improve such a list without modifying the word embeddings created.
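The first question can be illustrated with a minimal sketch. The abstract does not state which embedding model or similarity measure was used, so the cosine-similarity ranking below is an assumption, and the vectors and term names (`target_gene`, `variant_a`, and so on) are invented placeholders rather than data from the study. The sketch only shows the general idea of reading candidate term variants off the nearest neighbours of a term in an embedding space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_similar(target, embeddings, k=3):
    """Return the k terms whose vectors are closest to the target term's vector."""
    target_vec = embeddings[target]
    scored = [
        (term, cosine(target_vec, vec))
        for term, vec in embeddings.items()
        if term != target  # a term is not its own variant
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional vectors, invented for illustration only.
# Real word embeddings typically have hundreds of dimensions.
toy_embeddings = {
    "target_gene": [0.90, 0.10, 0.00],
    "variant_a": [0.85, 0.15, 0.05],
    "variant_b": [0.80, 0.20, 0.10],
    "unrelated_1": [0.00, 0.90, 0.40],
    "unrelated_2": [0.10, 0.20, 0.90],
}

candidates = top_k_similar("target_gene", toy_embeddings, k=2)
for term, score in candidates:
    print(f"{term}\t{score:.3f}")
```

In this toy setting the two placeholder variants rank above the unrelated terms because their vectors point in nearly the same direction as the target's. How the CVDO is used to refine such a list is not described in this part of the abstract, so it is not sketched here.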

Authors

  • Mercedes Arguello Casteleiro
    School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK.
  • George Demetriou
    School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK.
  • Warren Read
    School of Computer Science, University of Manchester (UK).
  • Maria Jesus Fernandez Prieto
    Salford Languages, University of Salford (UK).
  • Nava Maroto
    Departamento de Lingüística Aplicada a la Ciencia y a la Tecnología, Universidad Politécnica de Madrid, Madrid, Spain.
  • Diego Maseda Fernandez
    Midcheshire Hospital Foundation Trust, NHS England (UK).
  • Goran Nenadic
    School of Computer Science, University of Manchester, Manchester, UK.
  • Julie Klein
    Institut National de la Santé et de la Recherche Médicale (INSERM), U1048, Toulouse, 24105, France.
  • John Keane
    School of Computer Science, University of Manchester (UK).
  • Robert Stevens
    School of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom. Electronic address: robert.stevens@manchester.ac.uk.