A Case Study on Sepsis Using PubMed and Deep Learning for Ontology Learning.

Journal: Studies in health technology and informatics
Published Date:

Abstract

We investigate the application of distributional semantics models to the unsupervised extraction of biomedical terms from unannotated corpora. Term extraction is the first step of an ontology learning process that aims at (semi-)automatic annotation of biomedical concepts and relations from more than 300K PubMed titles and abstracts. We experimented with traditional distributional semantics methods such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA), as well as the neural language models CBOW and Skip-gram from Deep Learning. The evaluation concentrates on sepsis, a major life-threatening condition, and shows that the Deep Learning models outperform LSA and LDA, achieving much higher precision.
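
The general idea behind distributional term extraction can be illustrated with a minimal sketch. It is not the authors' pipeline: it replaces LSA, LDA, CBOW and Skip-gram with plain co-occurrence counts and cosine similarity, and it assumes a tiny corpus of invented sentences rather than PubMed text. The names `TOY_CORPUS`, `cooccurrence_vectors`, `rank_candidates` and `precision` are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter, defaultdict

# Toy sentences invented for illustration only; NOT taken from PubMed.
TOY_CORPUS = [
    "sepsis causes organ dysfunction in patients",
    "septic shock causes organ dysfunction in patients",
    "sepsis is treated with antibiotics",
    "septic shock is treated with antibiotics and fluids",
    "the journal publishes studies every month",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(seed, vectors, top_n=5):
    """Return the `top_n` words whose context vectors are most similar to the seed's."""
    seed_vec = vectors.get(seed, Counter())
    scores = [(w, cosine(seed_vec, vec)) for w, vec in vectors.items() if w != seed]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


def precision(candidates, relevant):
    """Fraction of proposed candidate terms that appear in the relevant set."""
    if not candidates:
        return 0.0
    return sum(1 for c in candidates if c in relevant) / len(candidates)


if __name__ == "__main__":
    vecs = cooccurrence_vectors(TOY_CORPUS)
    for word, score in rank_candidates("sepsis", vecs):
        print(f"{word}\t{score:.3f}")
```

In this sketch, words that share contexts with the seed term "sepsis" score higher than unrelated words, and `precision` shows how a ranked candidate list could be scored against a set of terms judged relevant. A neural model such as CBOW or Skip-gram would replace the raw count vectors with learned dense embeddings, while the ranking and scoring steps would stay conceptually similar.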

Authors

  • Mercedes Arguello Casteleiro
    School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK.
  • Diego Maseda Fernandez
    Midcheshire Hospital Foundation Trust, NHS England (UK).
  • George Demetriou
    School of Computer Science, University of Manchester, Oxford Road, M13 9PL Manchester, UK.
  • Warren Read
    School of Computer Science, University of Manchester (UK).
  • Maria Jesus Fernandez Prieto
    Salford Languages, University of Salford (UK).
  • Julio Des Diz
    Hospital do Salnés de Villagarcia, SERGAS (Spain).
  • Goran Nenadic
    School of Computer Science, University of Manchester, Manchester, UK.
  • John Keane
    School of Computer Science, University of Manchester (UK).
  • Robert Stevens
    School of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom. Electronic address: robert.stevens@manchester.ac.uk.