tESA: a distributional measure for calculating semantic relatedness.

Journal: Journal of biomedical semantics
Published Date:

Abstract

BACKGROUND: Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on the words that represent these concepts. Approximating semantic relatedness between texts and the concepts represented by these texts is an important part of many text and knowledge processing tasks that are crucial in the ever-growing domain of biomedical informatics. The problem with most state-of-the-art methods for calculating semantic relatedness is their dependence on highly specialized, structured knowledge resources, which makes these methods poorly adaptable to many usage scenarios. On the other hand, domain knowledge in the Life Sciences has become more and more accessible, but mostly in unstructured form, as texts in large document collections, which makes it more challenging to use in automated processing. In this paper we present tESA, an extension to the well-known Explicit Semantic Analysis (ESA) method.

Authors

  • Maciej Rybinski
    Departamento LCC, University of Malaga, Campus Teatinos, Malaga, 29010, Spain.
  • José Francisco Aldana-Montes
    Departamento LCC, University of Malaga, Campus Teatinos, Malaga, 29010, Spain. jfam@lcc.uma.es.