BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine.

Journal: Journal of biomedical semantics
Published Date:

Abstract

BACKGROUND: Recent advances in representation learning have enabled large strides in natural language understanding; however, verbal reasoning remains a challenge for state-of-the-art systems. External sources of structured, expert-curated verb-related knowledge have been shown to boost model performance in different Natural Language Processing (NLP) tasks where accurate handling of verb meaning and behaviour is critical. The cost and time required for manual lexicon construction have been a major obstacle to porting the benefits of such resources to NLP in specialised domains, such as biomedicine. To address this issue, we combine a neural classification method with expert annotation to create BioVerbNet. This new resource comprises 693 verbs assigned to 22 top-level and 117 fine-grained semantic-syntactic verb classes. We make this resource available complete with semantic roles and VerbNet-style syntactic frames.

Authors

  • Olga Majewska
    Language Technology Laboratory, MMLL, University of Cambridge, 9 West Road, Cambridge, CB39DB, UK.
  • Charlotte Collins
    Language Technology Laboratory, MMLL, University of Cambridge, 9 West Road, Cambridge, CB39DB, UK.
  • Simon Baker
    Language Technology Laboratory, TAL, University of Cambridge, Cambridge, United Kingdom.
  • Jari Björne
    TurkuNLP group, Department of Future Technologies, University of Turku, Turku, Finland.
  • Susan Windisch Brown
    Department of Linguistics, University of Colorado Boulder, 295 UCB, Boulder, 80309-0295, Colorado, USA.
  • Anna Korhonen
    Computer Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, UK. alk23@cam.ac.uk.
  • Martha Palmer
    Department of Linguistics, University of Colorado at Boulder, Colorado, 80309-0295, USA.