Exploring relation types for literature-based discovery.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

OBJECTIVE: Literature-based discovery (LBD) aims to identify "hidden knowledge" in the medical literature by: (1) analyzing documents to identify pairs of explicitly related concepts (terms), then (2) hypothesizing novel relations between pairs of unrelated concepts that are implicitly related via a shared concept to which both are explicitly related. Many LBD approaches use simple techniques, for example, document co-occurrence, to identify semantically weak relations between concepts. These techniques generate huge numbers of hypotheses, which are difficult for humans to assess. More complex techniques rely on linguistic analysis, for example, shallow parsing, to identify semantically stronger relations. Such approaches generate fewer hypotheses, but may miss hidden knowledge. The authors investigate this trade-off in detail, comparing techniques for identifying related concepts to discover which are most suitable for LBD.
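
The two-step procedure described above can be illustrated with a minimal sketch, assuming the weakest relation type mentioned (document co-occurrence). This is not the authors' implementation: the function names, the representation of a document as a set of concept strings, and the toy documents are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_relations(documents):
    """Step 1: treat two concepts as explicitly related if they share a document."""
    related = defaultdict(set)
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            related[a].add(b)
            related[b].add(a)
    return related


def hypothesize(related):
    """Step 2: propose pairs that are not explicitly related but share a linking concept."""
    hypotheses = defaultdict(set)
    for linking, neighbours in related.items():
        for a, c in combinations(sorted(neighbours), 2):
            if c not in related.get(a, set()):
                # Record every linking concept that supports the hypothesis (a, c).
                hypotheses[(a, c)].add(linking)
    return dict(hypotheses)


# Hypothetical toy corpus: each document is the set of concepts found in it.
docs = [
    {"fish oil", "blood viscosity"},
    {"blood viscosity", "raynaud disease"},
    {"fish oil", "platelet aggregation"},
    {"platelet aggregation", "raynaud disease"},
]

related = cooccurrence_relations(docs)
hypotheses = hypothesize(related)
for pair, links in sorted(hypotheses.items()):
    print(pair, "via", sorted(links))
```

Even on four toy documents, co-occurrence proposes every unrelated pair that shares a neighbour, which hints at why weak relations produce so many hypotheses on a real corpus. Replacing `cooccurrence_relations` with a stricter extractor (for example, one based on parsed relations) would shrink the `related` map and therefore the hypothesis set, at the risk of dropping genuine links.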

Authors

  • Judita Preiss
    Department of Computer Science, The University of Sheffield, 211 Portobello, Sheffield S1 4DP, UK; j.preiss@sheffield.ac.uk
  • Mark Stevenson
    Department of Computer Science, The University of Sheffield, 211 Portobello, Sheffield S1 4DP, UK.
  • Robert Gaizauskas
    Department of Computer Science, The University of Sheffield, 211 Portobello, Sheffield S1 4DP, UK.