deepBioWSD: effective deep neural word sense disambiguation of biomedical text data.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

OBJECTIVE: In biomedicine, a wealth of information is hidden in unstructured narratives such as research articles and clinical reports. To exploit these data properly, a word sense disambiguation (WSD) algorithm is needed to prevent downstream difficulties in the natural language processing pipeline. Supervised WSD algorithms largely outperform unsupervised, semisupervised, and knowledge-based methods; however, they train one separate classifier for each ambiguous term, which requires a large amount of expert-labeled training data, an unattainable goal in medical informatics. To alleviate this need, a single model that shares statistical strength across all instances and scales well with the vocabulary size is desirable.

Authors

  • Ahmad Pesaranghader
    Faculty of Computer Science, Dalhousie University, Halifax, NS B3H 4R2, Canada, Institute for Big Data Analytics, Halifax, NS B3H 4R2, Canada.
  • Stan Matwin
    Faculty of Computer Science, Dalhousie University, Halifax, NS B3H 4R2, Canada, Institute for Big Data Analytics, Halifax, NS B3H 4R2, Canada, Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland.
  • Marina Sokolova
    Institute for Big Data Analytics, Halifax, NS B3H 4R2, Canada, Faculty of Medicine and Faculty of Engineering, University of Ottawa, Ottawa, ON K1H 8M5, Canada.
  • Ali Pesaranghader
    School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada.