Preliminary Analysis of Difficulty of Importing Pattern-Based Concepts into the National Cancer Institute Thesaurus.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Maintaining biomedical ontologies is difficult. We have developed a pattern-based method for identifying concepts that are missing from the National Cancer Institute Thesaurus (NCIt). Specifically, we mine patterns that connect NCIt concepts with concepts in other ontologies to identify candidate missing concepts. However, the final decision about inserting a concept always rests with a human ontology curator. In this paper, we estimate the difficulty of this task for a domain expert by counting the possible choices for a pattern-based insertion. We conclude that, even with the support of our mining algorithm, the insertion task is challenging.

Authors

  • Zhe He
    School of Information, Florida State University, Tallahassee, FL, USA.
  • James Geller
    Dept of Computer Science, NJIT, Newark, NJ, USA.