CoCoScore: context-aware co-occurrence scoring for text mining applications using distant supervision.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Information extraction by mining the scientific literature is key to uncovering relations between biomedical entities. Most existing approaches based on natural language processing extract relations from single sentence-level co-mentions, ignoring co-occurrence statistics over the whole corpus. Conversely, existing approaches that count entity co-occurrences ignore the textual context of each co-occurrence.

Authors

  • Alexander Junge
    Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen N 2200, Denmark.
  • Lars Juhl Jensen
    Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece; Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Max Planck Institute for Marine Microbiology, Bremen, Germany; Jacobs University gGmbH, School of Engineering and Sciences, Bremen, Germany; Marine Biological Laboratory, Woods Hole, MA 02543, USA; and National Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012, USA.