Learning adaptive representations for entity recognition in the biomedical domain.

Journal: Journal of Biomedical Semantics
Published Date:

Abstract

BACKGROUND: Named Entity Recognition is a common task in Natural Language Processing applications, whose purpose is to recognize named entities in textual documents. Several systems address this task in the biomedical domain, combining Natural Language Processing techniques with Machine Learning algorithms. A crucial step in these applications is the choice of the representation used to describe the data. Several representations have been proposed in the literature. Some rely on strong domain knowledge and consist of features manually defined by domain experts. These representations usually describe the problem well, but they require considerable human effort and annotated data. General-purpose representations such as word embeddings, on the other hand, do not require human domain knowledge, but they may be too general for a specific task.

Authors

  • Ivano Lauriola
    Department of Mathematics, University of Padova, Via Trieste 63, Padova, 35121, Italy. ivano.lauriola@phd.unipd.it.
  • Fabio Aiolli
    Dipartimento di Matematica "Tullio Levi-Civita," Università degli Studi di Padova, Via Trieste 63, 35121 Padova, Italy.
  • Alberto Lavelli
    HLT Research Unit, FBK, Trento, Italy. lavelli@fbk.eu.
  • Fabio Rinaldi
    University of Zurich, Institute of Computational Linguistics and Swiss Institute of Bioinformatics, Andreasstrasse 15, Zürich, CH-8050, Switzerland. rinaldi@cl.uzh.ch.