Acronym Disambiguation in Spanish Electronic Health Narratives Using Machine Learning Techniques.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Electronic Health Records (EHRs) are now widely used in hospitals, which has motivated the development of new methods for processing clinical narratives (unstructured data) that make context-based searches possible. Current approaches to processing the unstructured text in EHRs are based on applying text mining or natural language processing (NLP) techniques to the data. In particular, Named Entity Recognition (NER) is of paramount importance for retrieving specific biomedical concepts from the text, together with the semantic type of each concept retrieved. However, clinical notes very commonly contain many acronyms that NER processes cannot identify, and even when an acronym is identified it may correspond to several meanings, so the term found must be disambiguated. In this work we provide an approach to acronym disambiguation in Spanish EHRs using machine learning techniques.

Authors

  • Ignacio Rubio-López
    Universidad Politécnica de Madrid, Centro de Tecnología Biomédica, Spain.
  • Roberto Costumero
    Universidad Politécnica de Madrid, Centro de Tecnología Biomédica, Spain.
  • Héctor Ambit
    Universidad Politécnica de Madrid, Centro de Tecnología Biomédica, Spain.
  • Consuelo Gonzalo-Martín
    Universidad Politécnica de Madrid, Centro de Tecnología Biomédica, Spain.
  • Ernestina Menasalvas
    Universidad Politécnica de Madrid, Centro de Tecnología Biomédica, Spain.
  • Alejandro Rodríguez González
    Universidad Politécnica de Madrid, Centro de Tecnología Biomédica, Spain.