Literature-based biomedical image classification and retrieval.

Journal: Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society

Abstract

Literature-based image informatics techniques are essential for managing the rapidly increasing volume of information in the biomedical domain. Compound figure separation, modality classification, and image retrieval are three related tasks useful for enabling efficient access to the most relevant images contained in the literature. In this article, we describe approaches to these tasks and the evaluation of our methods as part of the 2013 medical track of ImageCLEF. In performing each of these tasks, the textual and visual features used to represent images are an important consideration often left unaddressed. Therefore, we also describe a gradient-based optimization strategy for determining meaningful combinations of features and apply the method to the image retrieval task. An evaluation of our optimization strategy indicates the method is capable of producing statistically significant improvements in retrieval performance. Furthermore, the results of the 2013 ImageCLEF evaluation demonstrate the effectiveness of our techniques. In particular, our text-based and mixed image retrieval methods ranked first among all the participating groups.

Authors

  • Matthew S Simpson
    Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
  • Daekeun You
    Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
  • Md Mahmudur Rahman
    Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
  • Zhiyun Xue
    Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
  • Dina Demner-Fushman
    Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, MD, USA.
  • Sameer Antani
    Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
  • George Thoma
    Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, MD, USA.