Visualization of medical concepts represented using word embeddings: a scoping review.

Journal: BMC Medical Informatics and Decision Making
Published Date:

Abstract

BACKGROUND: Analyzing the unstructured text contained in electronic health records (EHRs) remains a challenging task. Word embedding methods have become an essential foundation for neural network-based approaches to natural language processing (NLP): they learn dense, low-dimensional word representations from large unlabeled corpora, and these representations capture the implicit semantics of words. Models such as Word2Vec, GloVe, and FastText have been broadly applied and reviewed in the bioinformatics and healthcare fields, most often to embed clinical notes or activity and diagnostic codes. A subset of these works visualizes the learned embeddings, for either exploratory or evaluation purposes. However, visualization practices tend to be heterogeneous and lack overall guidelines.

Authors

  • Naima Oubenali
    Faculté Ingénierie et Management de la Santé, Univ. Lille, 59000, Lille, France. naimaoubenali@gmail.com.
  • Sabrina Messaoud
    Faculté Ingénierie et Management de la Santé, Univ. Lille, 59000, Lille, France.
  • Alexandre Filiot
    INCLUDE: Integration Center of the Lille University Hospital for Data Exploration, CHU Lille, 59000, Lille, France.
  • Antoine Lamer
    Faculté Ingénierie et Management de la Santé, Univ. Lille, 59000, Lille, France.
  • Paul Andrey
    INCLUDE: Integration Center of the Lille University Hospital for Data Exploration, CHU Lille, 59000, Lille, France.