SapBERT-Based Medical Concept Normalization Using SNOMED CT.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Word vector representations, known as embeddings, are commonly used in natural language processing. Contextualized representations in particular have been very successful in recent years. In this work, we analyze the impact of contextualized and non-contextualized embeddings on medical concept normalization, mapping clinical terms to SNOMED CT via a k-NN approach. The non-contextualized concept mapping performed much better (F1-score = 0.853) than the contextualized representation (F1-score = 0.322).
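
To illustrate what mapping terms to concepts "via a k-NN approach" can look like, here is a minimal sketch. It is not the authors' implementation. It assumes that every term has already been turned into an embedding vector, uses cosine similarity as the distance measure, and takes a majority vote among the k nearest labeled neighbors. The vectors, the identifiers CONCEPT_A and CONCEPT_B, and the function names are hypothetical placeholders, not real SNOMED CT codes or model outputs.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_normalize(query_vec, index, k=3):
    """Return the concept ID that wins a majority vote among the k nearest neighbors.

    index is a list of (embedding, concept_id) pairs. Ties are broken in favor of
    the concept whose neighbor is closest to the query.
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    votes = Counter(concept_id for _, concept_id in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy index: made-up 3-dimensional embeddings with placeholder concept IDs.
TOY_INDEX = [
    ([0.9, 0.1, 0.0], "CONCEPT_A"),
    ([0.8, 0.2, 0.1], "CONCEPT_A"),
    ([0.1, 0.9, 0.2], "CONCEPT_B"),
    ([0.0, 0.8, 0.3], "CONCEPT_B"),
]

if __name__ == "__main__":
    # A query vector close to the CONCEPT_A examples.
    print(knn_normalize([0.85, 0.15, 0.05], TOY_INDEX, k=3))
```

In a realistic setting the toy vectors would be replaced by embeddings of SNOMED CT terms and of the clinical terms to be normalized, and the linear scan would usually give way to an approximate nearest-neighbor index.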

Authors

  • Akhila Abdulnazar
    Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.
  • Markus Kreuzthaler
    Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.
  • Roland Roller
    German Research Center for Artificial Intelligence (DFKI).
  • Stefan Schulz
    Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.