Evaluating semantic similarity and relatedness over the semantic grouping of clinical term pairs.

Journal: Journal of biomedical informatics
Published Date:

Abstract

INTRODUCTION: This article explores how measures of semantic similarity and relatedness are affected by the semantic groups to which the compared concepts belong. Our goal is to determine whether the measures behave differently on homogeneous comparisons (where both concepts belong to the same group) and heterogeneous ones (where the concepts belong to different groups). We hypothesize that similarity measures will be significantly affected because they rely on hierarchical is-a relations, whereas relatedness measures should be less affected because they draw on a wider range of relations. We also evaluate the effect of combining different measures of similarity and relatedness. We hypothesize that these combined measures will correlate more closely with human judgment, since they better reflect the rich variety of information humans use when assessing similarity and relatedness.
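
The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record layout (`group_a`, `group_b`, `human`, `similarity`, `relatedness`), the choice of Spearman rank correlation, and the weighted average used to combine measures are all assumptions made for illustration, and the scores are invented placeholders rather than results from the article.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def split_by_group(pairs):
    """Separate same-group (homogeneous) from cross-group (heterogeneous) pairs."""
    homogeneous = [p for p in pairs if p["group_a"] == p["group_b"]]
    heterogeneous = [p for p in pairs if p["group_a"] != p["group_b"]]
    return homogeneous, heterogeneous


def combine(pair, weight=0.5):
    """Assumed combination scheme: weighted average of the two measure scores."""
    return weight * pair["similarity"] + (1 - weight) * pair["relatedness"]


# Invented placeholder pairs; group labels and scores are illustrative only.
pairs = [
    {"group_a": "DISO", "group_b": "DISO", "human": 3.5, "similarity": 0.8, "relatedness": 0.4},
    {"group_a": "DISO", "group_b": "DISO", "human": 2.0, "similarity": 0.5, "relatedness": 0.6},
    {"group_a": "DISO", "group_b": "DISO", "human": 1.0, "similarity": 0.2, "relatedness": 0.1},
    {"group_a": "DISO", "group_b": "CHEM", "human": 3.0, "similarity": 0.1, "relatedness": 0.7},
    {"group_a": "CHEM", "group_b": "PROC", "human": 1.5, "similarity": 0.3, "relatedness": 0.2},
    {"group_a": "DISO", "group_b": "PROC", "human": 2.5, "similarity": 0.2, "relatedness": 0.5},
]

homogeneous, heterogeneous = split_by_group(pairs)
for label, subset in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    human = [p["human"] for p in subset]
    for name, scores in (
        ("similarity", [p["similarity"] for p in subset]),
        ("relatedness", [p["relatedness"] for p in subset]),
        ("combined", [combine(p) for p in subset]),
    ):
        print(label, name, round(spearman(human, scores), 3))
```

Reporting the correlation separately for each subset makes it possible to see whether a measure's agreement with human judgment changes when the two concepts come from different semantic groups, which is the distinction the abstract sets out to test.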

Authors

  • Bridget T McInnes
    Department of Computer Science, Virginia Commonwealth University, 401 S. Main St., Rm E4225, Richmond, VA 23284, USA. Electronic address: btmcinnes@vcu.edu.
  • Ted Pedersen
    Department of Computer Science, University of Minnesota, 1114 Kirby Drive, Duluth, MN 55812, USA. Electronic address: tpederse@d.umn.edu.