Improving dictionary-based named entity recognition with deep learning.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Dictionary-based named entity recognition (NER) allows terms to be detected in a corpus and normalized to biomedical databases and ontologies. However, adapting it to a new entity type requires a new high-quality dictionary and an associated list of blocked names for that type. So far, such lists have been created by manually inspecting individual names to identify those that cause many false positives, a process that scales poorly.
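
To make the roles of the dictionary and the blocked-name list concrete, the following is a minimal sketch of dictionary-based tagging, not the authors' tagger. The dictionary entries, the placeholder identifiers (DEMO:1 and so on), the single-token matching, and the function name tag are all assumptions made for illustration. The name "was" is included because short gene symbols that coincide with common English words are a typical source of false positives.

```python
import re

# Hypothetical toy dictionary: lowercased name -> placeholder identifier.
# Real dictionaries map names to database or ontology identifiers.
DICTIONARY = {
    "tp53": "DEMO:1",
    "p53": "DEMO:1",      # synonym normalized to the same identifier
    "insulin": "DEMO:2",
    "was": "DEMO:3",      # gene symbol that collides with an English word
}

# Hypothetical blocklist: names that are in the dictionary but are
# suppressed because they would cause many false positives.
BLOCKED = {"was"}

TOKEN = re.compile(r"[A-Za-z0-9]+")


def tag(text, dictionary=DICTIONARY, blocked=BLOCKED):
    """Return (start, end, matched_text, identifier) for each dictionary hit.

    Simplification: only single alphanumeric tokens are matched,
    case-insensitively. Multi-word names are not handled here.
    """
    matches = []
    for m in TOKEN.finditer(text):
        name = m.group().lower()
        if name in blocked:
            continue  # blocked names are never reported
        if name in dictionary:
            matches.append((m.start(), m.end(), m.group(), dictionary[name]))
    return matches


if __name__ == "__main__":
    sentence = "TP53 was mutated, and insulin levels fell."
    print(tag(sentence))                 # with the blocklist applied
    print(tag(sentence, blocked=set()))  # without it, "was" is also tagged
```

Running the sketch with and without the blocklist shows why each blocked name matters: without it, the ordinary verb "was" is normalized to a gene identifier.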

Authors

  • Katerina Nastou
    Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3, Copenhagen, 2200, Denmark.
  • Mikaela Koutrouli
    Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3, Copenhagen, 2200, Denmark.
  • Sampo Pyysalo
  • Lars Juhl Jensen
    Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece, Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark, Max Planck Institute for Marine Microbiology, Bremen, Germany, Jacobs University gGmbH, School of Engineering and Sciences, Bremen, Germany, Marine Biological Laboratory, Woods Hole, MA 02543, USA and National Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012, USA.